Method for diagnosing the stage of a thyroid tumor

ABSTRACT

The present invention relates to the use of genes differentially expressed in benign thyroid lesions and malignant thyroid lesions for the diagnosis and staging of thyroid cancer.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. patent application Ser. No. 11/547,995, filed Dec. 21, 2007, which is a national stage filing of PCT Application Serial Number PCT/US2005/012289, filed Apr. 11, 2005, which claims the benefit under 35 U.S.C. §119(e) of U.S. Provisional Application Ser. No. 60/560,900, filed Apr. 9, 2004, and of U.S. Provisional Application Ser. No. 60/622,643, filed Oct. 26, 2004, all of which are herein incorporated in their entireties by this reference.

TECHNICAL FIELD

The present invention relates to the use of genes differentially expressed in benign thyroid lesions and malignant thyroid lesions for the diagnosis and staging of thyroid cancer.

BACKGROUND

It is well known that cancer results from changes in gene expression patterns that are important for cellular regulatory processes such as growth, differentiation, DNA duplication, mismatch repair and apoptosis. It is also becoming more apparent that effective treatment and diagnosis of cancer is dependent upon an understanding of these important processes. Classification of human cancers into distinct groups based on their origin and histopathological appearance has historically been the foundation for diagnosis and treatment. This classification is generally based on cellular architecture, certain unique cellular characteristics and cell-specific antigens only. In contrast, gene expression assays have the potential to identify thousands of unique characteristics for each tumor type (3) (4). Elucidating a genome wide expression pattern for disease states not only could have an enormous impact on the understanding of specific cell biology, but could also provide the necessary link between molecular genetics and clinical medicine (5) (6) (7).

Thyroid carcinoma represents 1% of all malignant diseases, but 90% of all neuroendocrine malignancies. It is estimated that 5-10% of the population will develop a clinically significant thyroid nodule during their life-time (8). The best available test in the evaluation of a patient with a thyroid nodule is fine needle aspiration biopsy (FNA) (9). Of the malignant FNAs, the majority are from papillary thyroid cancers (PTC) or their follicular variant (FVPTC). These can be easily diagnosed if they have the classic cytologic features including abundant cellularity and enlarged nuclei containing intra-nuclear grooves and inclusions (10). Indeed, one third of the time these diagnoses are clear on FNA. Fine needle aspiration biopsy of thyroid nodules has greatly reduced the need for thyroid surgery and has increased the percentage of malignant tumors among excised nodules (11, 12). In addition, the diagnosis of malignant thyroid tumors, combined with effective therapy, has led to a marked decrease in morbidity due to thyroid cancer. Unfortunately, many thyroid FNAs are not definitively benign or malignant, yielding an “indeterminate” or “suspicious” diagnosis. The prevalence of indeterminate FNAs varies, but typically ranges from 10-25% of FNAs (13-15). In general, thyroid FNAs are indeterminate due to overlapping or undefined morphologic criteria for benign versus malignant lesions, or focal nuclear atypia within otherwise benign specimens. Of note, twice as many patients are referred for surgery for a suspicious lesion (10%) as for a malignant lesion (5%), an occurrence that is not widely appreciated since the majority of FNAs are benign. Therefore, when the diagnosis is unclear on FNA, these patients are classified as having a suspicious or indeterminate lesion only. It is well known that frozen section analysis often yields no additional information.

The question then arises: “Should the surgeon perform a thyroid lobectomy, which is appropriate for benign lesions, or a total thyroidectomy, which is appropriate for malignant lesions, when the diagnosis is uncertain both preoperatively and intra-operatively?” Thyroid lobectomy as the initial procedure for every patient with a suspicious FNA could result in the patient with cancer having to undergo a second operation for completion thyroidectomy. Conversely, total thyroidectomy for all patients with suspicious FNA would result in a majority of patients undergoing an unnecessary surgical procedure, requiring lifelong thyroid hormone replacement and exposure to the inherent risks of surgery (16).

Several attempts to formulate a consensus about classification and treatment of thyroid carcinoma based on standard histopathologic analysis have resulted in published guidelines for diagnosis and initial disease management (2). In the past few decades no improvement has been made in the differential diagnosis of thyroid tumors by fine needle aspiration biopsy (FNA), specifically suspicious or indeterminate thyroid lesions, suggesting that a new approach to this should be explored. Thus, there is a compelling need to develop more accurate initial diagnostic tests for evaluating a thyroid nodule.

SUMMARY

This invention is based in part on the discovery of genes whose expression levels can be correlated to benign or malignant states in a thyroid cell. Thus, the present invention provides differentially expressed genes that can be utilized to diagnose, stage and treat thyroid cancer. These differentially expressed genes are collectively referred to herein as “Differentially Expressed Thyroid” genes (“DET” genes). Examples of these DET genes are provided herein and include C21orf4 (DET1), Hs.145049 (DET2), Hs.296031 (DET3), KIT (DET4), LSM7 (DET5), SYNGR2 (DET6), C11orf8 (DET7), CDH1 (DET8), FAM13A1 (DET9), IMPACT (DET10) and KIAA1128 (DET11).

The present invention provides a gene expression approach to diagnose benign vs. malignant thyroid lesions. Identification of differentially expressed genes allows the development of models that can differentiate benign vs. malignant thyroid tumors. Results obtained from these models provide a molecular classification system for thyroid tumors and this in turn provides a more accurate diagnostic tool for the clinician managing patients with suspicious thyroid lesions.

The present invention also provides a method for classifying a thyroid lesion in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid lesion classification is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, thereby classifying the thyroid lesion in the subject.

Further provided is a method for classifying a thyroid lesion in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid lesion classification is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, in the test cell population and reference cell population, thereby classifying the thyroid lesion in the subject.

The present invention also provides a method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.

Further provided by the present invention is a method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.

Also provided by the present invention is a method of identifying an agent for treating a thyroid tumor, the method comprising: a) contacting a population of thyroid tumor cells from a subject for which a tumor stage is known, wherein at least one cell in said population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, with a test agent; b) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6 in the population; c) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and d) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, in the test cell population and reference cell population, such that if there is a difference corresponding to an improvement, a therapeutic agent for treating a thyroid tumor has been identified.

The present invention also provides a method of identifying an agent for treating a thyroid tumor, the method comprising: a) contacting a population of thyroid tumor cells from a subject for which a tumor stage is known, wherein at least one cell in said population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, with a test agent; b) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in the population; c) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and d) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, such that if there is a difference corresponding to an improvement, a therapeutic agent for treating a thyroid tumor has been identified.

Also provided by the present invention is a kit comprising one or more reagents for detecting the expression of one or more nucleic acid(s) selected from the group consisting of DET1, DET2, DET3, DET4, DET5, DET6, DET7, DET8, DET9, DET10 and DET11.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows PCA (principal component analysis) organization in a three-dimensional space of all samples divided into four groups: hyperplastic nodule (HN), follicular adenoma (FA), follicular variant of papillary thyroid carcinoma (FVPTC) and papillary thyroid carcinoma (PTC). Each dot represents how that sample is localized in space on the basis of its gene expression profile. The distance between any pair of points is related to the similarity between the two observations in high dimensional space. The principal components are plotted along the various axes (x, y, z). The % indicates the total amount of variance captured by the PCs; the first PC is the one capturing the largest amount of variance, or information, the second PC the second largest, etc. Three PCs were plotted, thus creating a 3D plot.
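By way of illustration only, the following minimal Python sketch shows how a first principal component and the fraction of variance it captures could be computed from a small table of expression values. It is not the software used to generate FIG. 1; the function name, the power-iteration approach and the sample values are assumptions chosen for brevity.

```python
import math

def first_principal_component(samples, iterations=200):
    """Return (unit vector, fraction of total variance) for the first PC.

    samples: list of equal-length lists, one row per sample and one
    column per gene. Uses power iteration on the sample covariance
    matrix; assumes the data are not all identical and that the start
    vector is not orthogonal to the first PC.
    """
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance

# Hypothetical two-gene expression values for four samples.
direction, fraction = first_principal_component(
    [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
print(direction, fraction)
```

In this hypothetical data set the second gene is always twice the first, so a single component captures essentially all of the variance.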

FIG. 2 shows PCA organization in a three-dimensional space of all samples divided into two groups: benign (HN-FA) and malignant (FVPTC-PTC). Each dot represents how that sample is localized in space on the basis of its gene expression profile. The distance between any pair of points is related to the similarity between the two observations in high dimensional space.

FIG. 3 shows PCA organization in a three-dimensional space of all samples with (A) and without the unknowns (B) based on the gene expression values of the six most informative genes. It is clear there is a separation of the two groups and that it is possible to predict visually the diagnosis of each unknown. The pathological diagnoses of the unknowns are marked respectively with a + and a * for the benign and the malignant tumor. The red + sign indicates an unknown sample for which pathological diagnosis and predicted diagnosis were discordant. Based on our six gene diagnostic predictor model, this lesion was placed in the malignant group. Upon re-review by the pathologist, this sample was reclassified from benign to a neoplasm of uncertain malignant potential.

FIG. 4 is a graph showing gene expression profiles of ten unknown samples. On the basis of their profile the predictor model of this invention gave a correct diagnosis in 100% of the cases. The y axis represents the ratio between thyroid tumor mRNA expression level (Cy5 fluorescence intensity) and control thyroid tissue mRNA expression level (Cy3 fluorescence intensity).
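As a non-limiting illustration of the ratio plotted in FIG. 4, the sketch below computes the tumor-to-control ratio from a Cy5 and a Cy3 intensity, together with its base-2 logarithm. The log transform is an assumption added here because it makes increases and decreases symmetric; the figure itself is described only as plotting the ratio. Function names and intensity values are hypothetical.

```python
import math

def expression_ratio(cy5_tumor, cy3_control):
    """Ratio of tumor (Cy5) to control (Cy3) fluorescence intensity."""
    if cy5_tumor <= 0 or cy3_control <= 0:
        raise ValueError("intensities must be positive")
    return cy5_tumor / cy3_control

def log2_ratio(cy5_tumor, cy3_control):
    """Base-2 log of the ratio; positive means higher in the tumor sample."""
    return math.log2(expression_ratio(cy5_tumor, cy3_control))

# Hypothetical intensities, for illustration only.
print(expression_ratio(2000.0, 500.0), log2_ratio(2000.0, 500.0))
print(expression_ratio(250.0, 1000.0), log2_ratio(250.0, 1000.0))
```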

FIG. 5 shows the results of RT-PCR utilizing the 6 gene predictor model. The RT-PCR data using 6 genes across 42 patient samples demonstrate separation by group.

FIG. 6 shows immunohistochemical results for expression of KIT and CDH1 in malignant and benign thyroid lesions. These results correlate with the expression data obtained via microarray and RT-PCR.

FIG. 7 shows the results of RT-PCR utilizing the 10 gene predictor model. The RT-PCR data using 10 genes demonstrate separation by group.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

Differentially Expressed Thyroid Genes

One aspect of the invention relates to genes that are differentially expressed in benign and/or malignant thyroid lesions relative to normal thyroid tissue. These differentially expressed genes are collectively referred to herein as “Differentially Expressed Thyroid” genes (“DET” genes). The corresponding gene products are referred to as “DET products”, “DET polypeptides” and/or “DET proteins”. The DET genes of the present invention include C21orf4, Hs.145049, Hs.296031, KIT, LSM7, SYNGR2, C11orf8, CDH1, FAM13A1, IMPACT and KIAA1128. The following provides a brief description of each DET gene provided herein.

C21orf4 (DET1)

C21orf4 is a gene encoding an integral membrane protein of unknown function, located in the q region of chromosome 21. C21orf4 was found to be upregulated in benign thyroid lesions and upregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, C21orf4 was found to be upregulated in benign tissue as compared to malignant tissue. An example of a nucleic acid encoding C21orf4 is set forth herein as SEQ ID NO: 40. Nucleic acid sequences for C21orf4 can also be accessed via GenBank Accession No. AP001717, GenBank Accession No. NM_006134 and via Unigene No. Hs.433668. All of the information, including any nucleic acid and amino acid sequences provided for C21orf4 under GenBank Accession No. AP001717, GenBank Accession No. NM_006134 and Unigene No. Hs.433668 is hereby incorporated in its entirety by this reference.

Hs.145049 (DET2)

Hs.145049, formerly known as Hs.24183, is a sodium-D-glucose transporter. The Unigene cluster identified as Unigene No. Hs.24183 has been retired and has been replaced by Hs.145049. Hs.145049 was found to be upregulated in both benign and malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, Hs.145049 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding Hs.145049 is set forth herein as SEQ ID NO: 42. Nucleic acid sequences for Hs.145049 can also be accessed via GenBank Accession No. NP_060265, via GenBank Accession No. AL832414.1 and via Unigene No. Hs.145049. All of the information, including any nucleic acid and amino acid sequences provided for Hs.145049 under GenBank Accession No. NP_060265, GenBank Accession No. AL832414 and Unigene No. Hs.145049 is hereby incorporated in its entirety by this reference.

Hs.296031 (DET3)

Hs.296031 is a gene of unknown function. Hs.296031 was found to be downregulated in benign and comparable to normal in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, Hs.296031 was found to be upregulated in malignant tissue as compared to benign tissue. A nucleic acid encoding Hs.296031 is set forth herein as SEQ ID NO: 44. Nucleic acid sequences for Hs.296031 can also be accessed via GenBank Accession No. BC038512 and via Unigene No. Hs.296031. All of the information, including any nucleic acid and amino acid sequences provided for Hs.296031 under GenBank Accession No. BC038512 and Unigene No. Hs.296031 is hereby incorporated in its entirety by this reference.

c-kit Proto-Oncogene (KIT) (DET4)

KIT is a proto-oncogene that functions as a transmembrane receptor tyrosine kinase and is involved in cellular proliferation. See Yarden et al. “Human proto-oncogene c-kit: a new cell surface receptor tyrosine kinase for an unidentified ligand” EMBO J. 6(11): 3341-3351 (1987). The Yarden et al. reference is incorporated herein in its entirety for the purpose of describing KIT function as well as for incorporating all KIT protein sequences and nucleic acids encoding KIT provided in the Yarden et al. reference. KIT was found to be downregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, KIT was found to be upregulated in benign tissue as compared to malignant tissue. Thus, KIT expression decreases during malignancy. A nucleic acid encoding KIT is set forth herein as SEQ ID NO: 45. Nucleic acid sequences for KIT can also be accessed via GenBank Accession Nos. X06182 and NM_000222 and via Unigene No. Hs.81665. All of the information, including any nucleic acid and amino acid sequences provided for KIT under GenBank Accession No. X06182, GenBank Accession No. NM_000222 and Unigene No. Hs.81665 is hereby incorporated in its entirety by this reference.

U6 Small Nuclear RNA Associated Homo sapiens LSM7 Homolog (LSM7) (DET5)

LSM7 is a U6 small nuclear ribonucleoprotein that is involved in tRNA processing. LSM7 was found to be upregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, LSM7 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid sequence encoding LSM7 is set forth herein as SEQ ID NO: 47. Nucleic acid sequences for LSM7 can also be accessed via GenBank Accession No. NM_016199 and via Unigene No. Hs.512610. All of the information, including any nucleic acid and amino acid sequences provided for LSM7 under GenBank Accession No. NM_016199 and Unigene No. Hs.512610 is hereby incorporated in its entirety by this reference.

Synaptogyrin 2 (SYNGR2) (DET6)

SYNGR2 is a synaptic vesicle protein that may play a role in regulating membrane traffic. SYNGR2 was found to be downregulated in benign thyroid lesions and comparable to normal in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, SYNGR2 was found to be upregulated in malignant tissue as compared to benign tissue. A nucleic acid encoding SYNGR2 is set forth herein as SEQ ID NO: 49. Nucleic acid sequences for SYNGR2 can also be accessed via GenBank Accession No. NM_004710 and via Unigene No. Hs.433753. All of the information, including any nucleic acid and amino acid sequences provided for SYNGR2 under GenBank Accession No. NM_004710 and Unigene No. Hs.433753 is hereby incorporated in its entirety by this reference.

C11orf8 (DET7)

C11orf8 is a gene involved in central nervous system development and function. C11orf8 was found to be downregulated in both benign thyroid lesions and malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, C11orf8 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding C11orf8 is set forth herein as SEQ ID NO: 51. Nucleic acid sequences for C11orf8 can also be accessed via GenBank Accession No. NM_001584 and via Unigene No. Hs.432000. All of the information, including any nucleic acid and amino acid sequences provided for C11orf8 under GenBank Accession No. NM_001584 and Unigene No. Hs.432000 is hereby incorporated in its entirety by this reference.

Cadherin 1, Type 1, E-Cadherin (CDH1) (DET8)

CDH1 is a cadherin protein involved in cell adhesion, motility, growth and proliferation. CDH1 was found to be upregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, CDH1 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding CDH1 is set forth herein as SEQ ID NO: 53. Nucleic acid sequences for CDH1 can also be accessed via GenBank Accession No. NM_004360 and via Unigene No. Hs.194657. All of the information, including any nucleic acid and amino acid sequences provided for CDH1 under GenBank Accession No. NM_004360 and Unigene No. Hs.194657 is hereby incorporated in its entirety by this reference.

Homo sapiens Family with Sequence Similarity 13, Member A1 (FAM13A1) (DET9)

FAM13A1 is a gene of unknown function. FAM13A1 was found to be upregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, FAM13A1 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding FAM13A1 is set forth herein as SEQ ID NO: 55. Nucleic acid sequences for FAM13A1 can also be accessed via GenBank Accession No. NM_014883 and via Unigene No. Hs.442818. All of the information, including any nucleic acid and amino acid sequences provided for FAM13A1 under GenBank Accession No. NM_014883 and Unigene No. Hs.442818 is hereby incorporated in its entirety by this reference.

Homo sapiens Hypothetical Protein IMPACT (IMPACT) (DET10)

IMPACT is a gene of unknown function. IMPACT was found to be upregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, IMPACT was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding IMPACT is set forth herein as SEQ ID NO: 57. Nucleic acid sequences for IMPACT can also be accessed via GenBank Accession No. NM_018439 and via Unigene No. Hs.284245. All of the information, including any nucleic acid and amino acid sequences provided for IMPACT under GenBank Accession No. NM_018439 and Unigene No. Hs.284245 is hereby incorporated in its entirety by this reference.

KIAA1128 Protein (KIAA1128) (DET11)

KIAA1128 is a gene of unknown function. KIAA1128 was found to be upregulated in benign thyroid lesions and downregulated in malignant thyroid lesions as compared to normal thyroid tissue. Upon comparing benign tissue with malignant tissue, KIAA1128 was found to be upregulated in benign tissue as compared to malignant tissue. A nucleic acid encoding KIAA1128 is set forth herein as SEQ ID NO: 59. Nucleic acid sequences for KIAA1128 can also be accessed via GenBank Accession No. AB032954 and via Unigene No. Hs.81897. All of the information, including any nucleic acid and amino acid sequences provided for KIAA1128 under GenBank Accession No. AB032954 and Unigene No. Hs.81897 is hereby incorporated in its entirety by this reference.
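For convenience, the benign-versus-malignant comparisons stated in the gene descriptions above can be collected in a simple lookup table. The following Python sketch is illustrative only; the variable and function names are hypothetical, and the table records only which tissue type showed the higher expression, not its magnitude.

```python
# Tissue type in which each DET gene was found to be more highly expressed
# when benign and malignant tissue were compared, as stated above.
HIGHER_IN = {
    "DET1": "benign",     # C21orf4
    "DET2": "benign",     # Hs.145049
    "DET3": "malignant",  # Hs.296031
    "DET4": "benign",     # KIT
    "DET5": "benign",     # LSM7
    "DET6": "malignant",  # SYNGR2
    "DET7": "benign",     # C11orf8
    "DET8": "benign",     # CDH1
    "DET9": "benign",     # FAM13A1
    "DET10": "benign",    # IMPACT
    "DET11": "benign",    # KIAA1128
}

def genes_higher_in(state):
    """Return DET identifiers, in numeric order, that are higher in `state`."""
    selected = [gene for gene, where in HIGHER_IN.items() if where == state]
    return sorted(selected, key=lambda gene: int(gene[3:]))

print(genes_higher_in("malignant"))
print(genes_higher_in("benign"))
```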

Diagnostic Methods

The present invention provides a method for classifying a thyroid lesion in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid lesion classification is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, thereby classifying the thyroid lesion in the subject.

The present invention also provides a method for classifying a thyroid lesion in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid lesion classification is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, in the test cell population and reference cell population, thereby classifying the thyroid lesion in the subject.
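By way of non-limiting example, steps a) through c) above can be sketched computationally as a comparison of a test expression profile against reference profiles of known classification. The Python sketch below uses a nearest-centroid rule, which is an assumption made for illustration and is not asserted to be the predictor model of the invention; all names and expression values are hypothetical.

```python
import math

def centroid(profiles):
    """Mean expression per gene across profiles of one known class."""
    genes = profiles[0].keys()
    return {g: sum(p[g] for p in profiles) / len(profiles) for g in genes}

def distance(a, b):
    """Euclidean distance between two profiles sharing the same genes."""
    return math.sqrt(sum((a[g] - b[g]) ** 2 for g in a))

def classify(test_profile, references):
    """Assign the label of the closest reference centroid.

    references maps a known classification to a list of profiles.
    """
    centroids = {label: centroid(ps) for label, ps in references.items()}
    return min(centroids, key=lambda label: distance(test_profile, centroids[label]))

# Hypothetical log-ratio expression values for three DET genes.
references = {
    "benign": [
        {"DET1": 1.2, "DET3": -0.9, "DET4": 0.4},
        {"DET1": 1.0, "DET3": -0.7, "DET4": 0.6},
    ],
    "malignant": [
        {"DET1": 0.1, "DET3": 0.1, "DET4": -1.5},
        {"DET1": 0.3, "DET3": -0.1, "DET4": -1.1},
    ],
}
print(classify({"DET1": 0.9, "DET3": -0.6, "DET4": 0.3}, references))
```

A centroid rule is used here only because it makes the comparison with the reference cell population explicit; any classifier trained on reference profiles of known classification could stand in its place.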

In the methods of the present invention, “classifying a thyroid lesion” is equivalent to diagnosing a subject with a type of thyroid lesion. These lesions can be benign or malignant. Examples of a benign lesion include, but are not limited to, follicular adenoma, hyperplastic nodule, papillary adenoma, thyroiditis nodule and multinodular goiter. Examples of malignant lesions include, but are not limited to, papillary thyroid carcinoma, follicular variant of papillary thyroid carcinoma, follicular carcinoma, Hurthle cell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroid lymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma.

In the methods of the present invention, measuring the expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 means that the expression of any combination of these sequences can be measured. For example, the expression level of one, two, three, four, five, six, seven, eight, nine or ten sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 can be measured. Similarly, when measuring the expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, one of skill in the art can measure the expression level of one, two, three, four, five or six sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6.
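To make “any combination” concrete, the following illustrative Python sketch enumerates every non-empty combination of the ten-gene and six-gene groups. The constant and function names are hypothetical.

```python
from itertools import combinations

TEN_GENE_GROUP = ["DET1", "DET2", "DET3", "DET4", "DET6",
                  "DET7", "DET8", "DET9", "DET10", "DET11"]
SIX_GENE_GROUP = ["DET1", "DET2", "DET3", "DET4", "DET5", "DET6"]

def all_combinations(genes):
    """Yield every non-empty combination of the given genes."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)

# A group of n genes has 2**n - 1 non-empty combinations.
print(sum(1 for _ in all_combinations(SIX_GENE_GROUP)))
print(sum(1 for _ in all_combinations(TEN_GENE_GROUP)))
```

The six-gene group therefore admits 63 possible combinations and the ten-gene group 1023.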

The methods of the present invention include providing a test cell population which includes at least one cell that is capable of expressing one or more of the sequences DET1-11. As utilized herein, “expression” refers to the transcription of a DET gene to yield a DET nucleic acid, such as a DET mRNA. The term “expression” also refers to the transcription and translation of a DET gene to yield the encoded protein, in particular a DET protein or a fragment thereof. Therefore, one of skill in the art can detect the expression of a DET gene by monitoring DET nucleic acid production and/or expression of the DET protein. As utilized herein, “upregulated” refers to an increase in expression and “downregulated” refers to a decrease in expression.
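As a hedged illustration of these definitions, the sketch below labels a measured level as upregulated, downregulated or comparable relative to a reference level. The two-fold threshold is an assumption chosen for the example; no particular cutoff is specified herein, and the function name is hypothetical.

```python
def call_regulation(test_level, reference_level, fold_threshold=2.0):
    """Label expression relative to a reference.

    fold_threshold is an illustrative assumption, not a value taken
    from this specification.
    """
    if test_level <= 0 or reference_level <= 0:
        raise ValueError("expression levels must be positive")
    fold = test_level / reference_level
    if fold >= fold_threshold:
        return "upregulated"
    if fold <= 1.0 / fold_threshold:
        return "downregulated"
    return "comparable"

# Hypothetical expression levels in arbitrary units.
print(call_regulation(400.0, 100.0))
print(call_regulation(50.0, 100.0))
print(call_regulation(110.0, 100.0))
```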

In the methods of the present invention, the reference cell population can be from normal thyroid tissue, cancerous thyroid tissue or any other type of thyroid tissue for which a classification is known. As used herein, “a cell of a normal subject” or “normal thyroid tissue” means a cell or tissue which is histologically normal and was obtained from a subject believed to be without malignancy and having no increased risk of developing a malignancy or was obtained from tissues adjacent to tissue known to be malignant and which is determined to be histologically normal (non-malignant) as determined by a pathologist.

Using the sequence information provided herein and the sequences provided by the database entries, the expression of the DET sequences or fragments thereof can be detected, if present, and measured using techniques well known in the art. For example, sequences disclosed herein can be used to construct probes for detecting DET DNA and RNA sequences. The amount of a DET nucleic acid, for example, DET mRNA, in a cell can be determined by methods standard in the art for detecting or quantitating nucleic acid in a cell, such as in situ hybridization, quantitative PCR, Northern blotting, ELISPOT, dot blotting, etc., as well as any other method now known or later developed for detecting or quantitating the amount of a nucleic acid in a cell.

The presence or amount of a DET protein in or produced by a cell can be determined by methods standard in the art, such as Western blotting, ELISA, ELISPOT, immunoprecipitation, immunofluorescence (e.g., FACS), immunohistochemistry, immunocytochemistry, etc., as well as any other method now known or later developed for detecting or quantitating protein in or produced by a cell.

As used throughout, by “subject” is meant an individual. Preferably, the subject is a mammal such as a primate, and, more preferably, a human. The term “subject” includes domesticated animals, such as cats, dogs, etc., livestock (e.g., cattle, horses, pigs, sheep, goats, etc.), and laboratory animals (e.g., mouse, monkey, rabbit, rat, guinea pig, etc.).

The present invention also provides for detection of variants of the DET nucleic acids and polypeptides disclosed herein. In general, variants of nucleic acids and polypeptides herein disclosed typically have at least about 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent homology to the stated sequence or the native sequence. Those of skill in the art readily understand how to determine the homology of two polypeptides or nucleic acids. For example, the homology can be calculated after aligning the two sequences so that the homology is at its highest level.

Homology can also be calculated by published algorithms. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85: 2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), by the BLAST algorithm of Tatusova and Madden, FEMS Microbiol. Lett. 174: 247-250 (1999), available from the National Center for Biotechnology Information, or by inspection. Similarly, the present invention provides for the detection of DET proteins that are homologues of human DET proteins in other species. It would be readily apparent to one of skill in the art that the DET sequences set forth herein and in GenBank can be utilized in sequence comparisons to identify DET sequences in other species.
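The alignment-then-score calculation described above can be illustrated with a minimal sketch. It implements a basic Needleman and Wunsch style global alignment and reports percent identity, used here as a simple stand-in for the percent homology discussed in the text. The function names and the scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions and are not taken from this disclosure; production work would normally rely on an established package such as those cited above.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    Scoring values are illustrative assumptions, not values from the source.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns that hold identical residues."""
    aligned_a, aligned_b = global_align(a.upper(), b.upper())
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

Under these assumptions, a candidate sequence could be screened against the 70 to 99 percent range recited above by comparing `percent_identity(candidate, reference)` with the chosen cutoff.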

The sample of this invention, such as a test cell population or a reference cell population, can be from any organism and can be, but is not limited to, peripheral blood, bone marrow specimens, primary tumors, embedded tissue sections, frozen tissue sections, cell preparations, cytological preparations, exfoliate samples (e.g., sputum), fine needle aspirations, lung fluid, amnion cells, fresh tissue, dry tissue, and cultured cells or tissue. The sample can be from malignant tissue or non-malignant tissue. The sample can be unfixed or fixed according to standard protocols widely available in the art and can also be embedded in a suitable medium for preparation of the sample. For example, the sample can be embedded in paraffin or other suitable medium (e.g., epoxy or acrylamide) to facilitate preparation of the biological specimen for the detection methods of this invention. Furthermore, the sample can be embedded in any commercially available mounting medium, either aqueous or organic.

The sample can be on, supported by, or attached to, a substrate which facilitates detection. A substrate of the present invention can be, but is not limited to, a microscope slide, a culture dish, a culture flask, a culture plate, a culture chamber, ELISA plates, as well as any other substrate that can be used for containing or supporting biological samples for analysis according to the methods of the present invention. The substrate can be of any material suitable for the purposes of this invention, such as, for example, glass, plastic, polystyrene, mica and the like. The substrates of the present invention can be obtained from commercial sources or prepared according to standard procedures well known in the art.

Conversely, an antibody or fragment thereof, an antigenic fragment of a DET protein, or DET nucleic acid of the invention can be on, supported by, or attached to a substrate which facilitates detection. Such a substrate can include a chip, a microarray or a mobile solid support. Thus, provided by the invention are substrates including one or more of the antibodies or antibody fragments, antigenic fragments of DET proteins, or DET nucleic acids of the invention.

The nucleic acids of this invention can be detected with a probe capable of hybridizing to the nucleic acid of a cell or a sample. This probe can be a nucleic acid comprising the nucleotide sequence of a coding strand or its complementary strand or the nucleotide sequence of a sense strand or antisense strand, or a fragment thereof. The nucleic acid can comprise the nucleic acid of a DET gene or fragments thereof. Thus, the probe of this invention can be either DNA or RNA and can bind either DNA or RNA, or both, in the biological sample. The probe can be the coding or complementary strand of a complete DET gene or DET gene fragment.

The nucleic acids of the present invention, for example, DET1-DET11 nucleic acids and fragments thereof, can be utilized as probes or primers to detect DET nucleic acids. Therefore, the present invention provides DET polynucleotide probes or primers that can be at least 15, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185, 190, 195, 200, 250, 300, 350 or at least 400 nucleotides in length.

As used herein, the term “nucleic acid probe” refers to a nucleic acid fragment that selectively hybridizes under stringent conditions with a nucleic acid comprising a nucleic acid set forth in a DET sequence provided herein. This hybridization must be specific. The degree of complementarity between the hybridizing nucleic acid and the sequence to which it hybridizes should be at least enough to exclude hybridization with a nucleic acid encoding an unrelated protein.

Stringent conditions refers to the washing conditions used in a hybridization protocol. In general, the washing conditions should be a combination of temperature and salt concentration chosen so that the denaturation temperature is approximately 5-20°C below the calculated Tm of the nucleic acid hybrid under study. The temperature and salt conditions are readily determined empirically in preliminary experiments in which samples of reference DNA immobilized on filters are hybridized to the probe or protein coding nucleic acid of interest and then washed under conditions of different stringencies. The Tm of such an oligonucleotide can be estimated by allowing 2°C for each A or T nucleotide, and 4°C for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore, have an approximate Tm of 54°C.
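The Tm estimate above (2°C per A or T, 4°C per G or C) and the 5-20°C window below it can be worked through in a few lines. This is a minimal sketch: the function names and the 18-nucleotide test sequence are illustrative assumptions, and the rule itself is only a rough approximation suited to short oligonucleotides.

```python
def estimate_tm(oligo):
    """Estimate Tm in degrees C: 2 per A or T, 4 per G or C (rule stated in the text)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


def wash_window(tm):
    """Temperature range lying 5 to 20 degrees C below the estimated Tm."""
    return (tm - 20, tm - 5)


# Hypothetical 18-nucleotide probe with 50% G+C (9 A/T and 9 G/C).
probe = "ACGTACGTACGTACGTAC"
tm = estimate_tm(probe)          # 9 * 2 + 9 * 4 = 54
low, high = wash_window(tm)      # 34 to 49
```

The hypothetical probe reproduces the worked example in the text: 9 A/T bases contribute 18°C and 9 G/C bases contribute 36°C, for an estimate of 54°C.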

Stringent conditions are known to one of skill in the art. See, for example, Sambrook et al. (2001). An example of stringent wash conditions is 4×SSC at 65°C. Highly stringent wash conditions include, for example, 0.2×SSC at 65°C.

As mentioned above, the DET nucleic acids and fragments thereof can be utilized as primers to amplify a DET nucleic acid, such as a DET gene transcript, by standard amplification techniques. For example, expression of a DET gene transcript can be quantified by RT-PCR using RNA isolated from cells, as described in the Examples.

A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see White (1997) and the publication entitled “PCR Methods and Applications” (1991, Cold Spring Harbor Laboratory Press), which is incorporated herein by reference in its entirety for amplification methods. In each of these PCR procedures, PCR primers on either side of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized to complementary nucleic acid sequences in the sample. The hybridized primers are extended. Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between the primer sites. PCR has further been described in several patents including U.S. Pat. Nos. 4,683,195, 4,683,202 and 4,965,188. Each of these publications is incorporated herein by reference in its entirety for PCR methods. One of skill in the art would know how to design and synthesize primers that amplify a DET sequence or a fragment thereof.

A detectable label may be included in an amplification reaction. Suitable labels include fluorochromes, e.g., fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM), 2′,7′-dimethoxy-4′,5′-dichloro-6-carboxyfluorescein (JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2′,4′,7′,4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); radioactive labels, e.g., ³²P, ³⁵S, ³H; etc. The label may be a two stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a high affinity binding partner, e.g., avidin, specific antibodies, etc., where the binding partner is conjugated to a detectable label. The label may be conjugated to one or both of the primers. Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate the label into the amplification product. The amplification reaction can also include a dual fluorescent probe, as described in the Examples, which hybridizes to and detects the amplification product, thus allowing real time quantitation of the amplification product.

Therefore, expression of the nucleic acid(s) of the present invention can be measured by amplifying the nucleic acid(s) and detecting the amplified nucleic acid with a fluorescent probe.

For example, DET1 can be amplified utilizing forward primer GCAATCCTCTTACCTCCGCTTT (SEQ ID NO: 7) and reverse primer GGAATCGGAGACAGAAGAGAGCTT (SEQ ID NO: 8). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence CTGGGACCACAGATGTATCCTCCACTCC (SEQ ID NO: 9) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET1, as one of skill in the art would know how to design primers, based on the DET1 nucleic acid sequences provided herein, such as SEQ ID NO: 40, and the nucleic acid sequences provided by the database entries, to amplify a DET1 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET1 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET1 nucleic acid sequences provided herein, such as SEQ ID NO: 40, and the nucleic acid sequences provided by the database entries, to detect a DET1 nucleic acid.

DET2 can be amplified utilizing forward primer GGCTGACTGGCAAAAAGTCTTG (SEQ ID NO: 1) and reverse primer TTGGTTCCCTTAAGTTCTCAGAGTTT (SEQ ID NO: 2). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence TGGCCCTGTCACTCCCATGATGC (SEQ ID NO: 3) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET2, as one of skill in the art would know how to design primers, based on the DET2 nucleic acid sequences provided herein, such as SEQ ID NO: 42, and the nucleic acid sequences provided by the database entries, to amplify a DET2 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET2 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET2 nucleic acid sequences provided herein, such as SEQ ID NO: 42, and the nucleic acid sequences provided by the database entries, to detect a DET2 nucleic acid.

DET3 can be amplified utilizing forward primer TGCCAAGGAGCTTTGTTTATAGAA (SEQ ID NO: 19) and reverse primer ATGACGGCATGTACCAACCA (SEQ ID NO: 20). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence TTGGTCCCCTCAGTTCTATGCTGTTGTGT (SEQ ID NO: 21) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET3, as one of skill in the art would know how to design primers, based on the DET3 nucleic acid sequences provided herein, such as SEQ ID NO: 44, and the nucleic acid sequences provided by the database entries, to amplify a DET3 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET3 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET3 nucleic acid sequences provided herein, such as SEQ ID NO: 44, and the nucleic acid sequences provided by the database entries, to detect a DET3 nucleic acid.

DET4 can be amplified utilizing forward primer GCACCTGCTGAAATGTATGACATAAT (SEQ ID NO: 22) and reverse primer TTTGCTAAGTTGGAGTAAATATGATTGG (SEQ ID NO: 23). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence ATTGTTCAGCTAATTGAGAAGCAGATTTCAGAGAGC (SEQ ID NO: 24) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET4, as one of skill in the art would know how to design primers, based on the DET4 nucleic acid sequences provided herein, such as SEQ ID NO: 45, and the nucleic acid sequences provided by the database entries, to amplify a DET4 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET4 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET4 nucleic acid sequences provided herein, such as SEQ ID NO: 45, and the nucleic acid sequences provided by the database entries, to detect a DET4 nucleic acid.

DET5 can be amplified utilizing forward primer GACGATCCGGGTAAAGTTCCA (SEQ ID NO: 34) and reverse primer AGGTTGAGGAGTGGGTCGAA (SEQ ID NO: 35). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence AGGCCGCGAAGCCAGTGGAATC (SEQ ID NO: 36) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET5, as one of skill in the art would know how to design primers, based on the DET5 nucleic acid sequences provided herein, such as SEQ ID NO: 47, and the nucleic acid sequences provided by the database entries, to amplify a DET5 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET5 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET5 nucleic acid sequences provided herein, such as SEQ ID NO: 47, and the nucleic acid sequences provided by the database entries, to detect a DET5 nucleic acid.

DET6 can be amplified utilizing forward primer GCTGGTGCTCATGGCACTT (SEQ ID NO: 31) and reverse primer CCCTCCCCAGGCTTCCTAA (SEQ ID NO: 32). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence AAGGGCTTTGCCTGACAACACCCA (SEQ ID NO: 33) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET6, as one of skill in the art would know how to design primers, based on the DET6 nucleic acid sequences provided herein, such as SEQ ID NO: 49, and the nucleic acid sequences provided by the database entries, to amplify a DET6 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET6 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET6 nucleic acid sequences provided herein, such as SEQ ID NO: 49, and the nucleic acid sequences provided by the database entries, to detect a DET6 nucleic acid.

DET7 can be amplified utilizing forward primer CCGGCCCAAGCTCCAT (SEQ ID NO: 13) and reverse primer TTGTGTAACCGTCGGTCATGA (SEQ ID NO: 14). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence TGTTTGGTGGAATCCATGAAGGTTATGGC (SEQ ID NO: 15) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET7, as one of skill in the art would know how to design primers, based on the DET7 nucleic acid sequences provided herein, such as SEQ ID NO: 51, and the nucleic acid sequences provided by the database entries, to amplify a DET7 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET7 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET7 nucleic acid sequences provided herein, such as SEQ ID NO: 51, and the nucleic acid sequences provided by the database entries, to detect a DET7 nucleic acid.

DET8 can be amplified utilizing forward primer TGAGTGTCCCCCGGTATCTTC (SEQ ID NO: 28) and reverse primer CAGCCGCTTTCAGATTTTCAT (SEQ ID NO: 29). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence CCTGCCAATCCCGATGAAATTGGAAAT (SEQ ID NO: 30) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET8, as one of skill in the art would know how to design primers, based on the DET8 nucleic acid sequences provided herein, such as SEQ ID NO: 53, and the nucleic acid sequences provided by the database entries, to amplify a DET8 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET8 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET8 nucleic acid sequences provided herein, such as SEQ ID NO: 53, and the nucleic acid sequences provided by the database entries, to detect a DET8 nucleic acid.

DET9 can be amplified utilizing forward primer ATGGCAGTGCAGTCATCATCTT (SEQ ID NO: 10) and reverse primer GCATTCATACAGCTGCTTACCATCT (SEQ ID NO: 11). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence TTTGGTCCCTGCCTAGGACCGGG (SEQ ID NO: 12) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET9, as one of skill in the art would know how to design primers, based on the DET9 nucleic acid sequences provided herein, such as SEQ ID NO: 55, and the nucleic acid sequences provided by the database entries, to amplify a DET9 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET9 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET9 nucleic acid sequences provided herein, such as SEQ ID NO: 55, and the nucleic acid sequences provided by the database entries, to detect a DET9 nucleic acid.

DET10 can be amplified utilizing forward primer TGAAGAATGTCATGGTGGTAGTATCA (SEQ ID NO: 25) and reverse primer ATGACTCCTCAGGTGAATTTGTGTAG (SEQ ID NO: 26). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence CTGGTATGGAGGGATTCTGCTAGGACCAG (SEQ ID NO: 27) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET10, as one of skill in the art would know how to design primers, based on the DET10 nucleic acid sequences provided herein, such as SEQ ID NO: 57, and the nucleic acid sequences provided by the database entries, to amplify a DET10 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET10 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET10 nucleic acid sequences provided herein, such as SEQ ID NO: 57, and the nucleic acid sequences provided by the database entries, to detect a DET10 nucleic acid.

DET11 can be amplified utilizing forward primer GAGAGCGTGATCCCCCTACA (SEQ ID NO: 16) and reverse primer ACCAAGAGTGCACCTCAGTGTCT (SEQ ID NO: 17). The nucleic acid amplified by these primers can be detected with a probe comprising the nucleic acid sequence TCACTTCCAAATGTTCCTGTAGCATAAATGGTG (SEQ ID NO: 18) linked to a fluorescent label. These primers are merely exemplary for the amplification of DET11, as one of skill in the art would know how to design primers, based on the DET11 nucleic acid sequences provided herein, such as SEQ ID NO: 59, and the nucleic acid sequences provided by the database entries, to amplify a DET11 nucleic acid. Similarly, the probe sequences provided herein are merely exemplary for the detection of a DET11 nucleic acid, as one of skill in the art would know how to design a probe, based on the DET11 nucleic acid sequences provided herein, such as SEQ ID NO: 59, and the nucleic acid sequences provided by the database entries, to detect a DET11 nucleic acid.
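How a forward primer, a reverse primer and a probe relate to the amplified fragment can be illustrated with a small in-silico check. The sketch below uses the DET1 primer and probe sequences given above (SEQ ID NOS: 7, 8 and 9), but the template is a toy construct assembled only for illustration; it is not the DET1 sequence of SEQ ID NO: 40, and the function names are hypothetical. Only exact matches are considered, which is a simplification of real primer annealing.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_amplicon(template, forward, reverse):
    """Return the fragment bounded by the primers, or None if either is absent.

    The forward primer must match the template strand exactly; the reverse
    primer anneals to the opposite strand, so its reverse complement is
    searched for downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# DET1 primers and probe as listed in the text.
FORWARD = "GCAATCCTCTTACCTCCGCTTT"         # SEQ ID NO: 7
REVERSE = "GGAATCGGAGACAGAAGAGAGCTT"       # SEQ ID NO: 8
PROBE = "CTGGGACCACAGATGTATCCTCCACTCC"     # SEQ ID NO: 9

# Toy template (an assumption for illustration only, NOT a real DET1 sequence).
toy_template = ("AAAA" + FORWARD + "TT" + PROBE + "CC"
                + reverse_complement(REVERSE) + "GGGG")

amplicon = find_amplicon(toy_template, FORWARD, REVERSE)
probe_binds = amplicon is not None and PROBE in amplicon
```

With a real DET reference sequence substituted for the toy template, the same check indicates whether a primer pair brackets the probe site.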

The sample nucleic acid, e.g., amplified fragment, can be analyzed by one of a number of methods known in the art. The nucleic acid can be sequenced by dideoxy or other methods. Hybridization with the sequence can also be used to determine its presence, by Southern blots, dot blots, etc.

The DET nucleic acids of the invention can also be used in polynucleotide arrays. Polynucleotide arrays provide a high throughput technique that can assay a large number of polynucleotide sequences in a single sample. This technology can be used, for example, as a diagnostic tool to identify samples with differential expression of DET nucleic acids as compared to a reference sample.

To create arrays, single-stranded polynucleotide probes can be spotted onto a substrate in a two-dimensional matrix or array. Each single-stranded polynucleotide probe can comprise at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, or 30 or more contiguous nucleotides selected from the nucleotide sequences of DET1-DET11. The substrate can be any substrate to which polynucleotide probes can be attached, including but not limited to glass, nitrocellulose, silicon, and nylon. Polynucleotide probes can be bound to the substrate by either covalent bonds or by non-specific interactions, such as hydrophobic interactions. Techniques for constructing arrays and methods of using these arrays are described in EP No. 0 799 897; PCT No. WO 97/29212; PCT No. WO 97/27317; EP No. 0 785 280; PCT No. WO 97/02357; U.S. Pat. Nos. 5,593,839; 5,578,832; EP No. 0 728 520; U.S. Pat. No. 5,599,695; EP No. 0 721 016; U.S. Pat. No. 5,556,752; PCT No. WO 95/22058; and U.S. Pat. No. 5,631,734. Commercially available polynucleotide arrays, such as Affymetrix GeneChip™, can also be used. Use of the GeneChip™ to detect gene expression is described, for example, in Lockhart et al., Nature Biotechnology 14:1675 (1996); Chee et al., Science 274:610 (1996); Hacia et al., Nature Genetics 14:441 (1996); and Kozal et al., Nature Medicine 2:753 (1996).

Tissue samples can be treated to form single-stranded polynucleotides, for example by heating or by chemical denaturation, as is known in the art. The single-stranded polynucleotides in the tissue sample can then be labeled and hybridized to the polynucleotide probes on the array. Detectable labels which can be used include but are not limited to radiolabels, biotinylated labels, fluorophors, and chemiluminescent labels. Double stranded polynucleotides, comprising the labeled sample polynucleotides bound to polynucleotide probes, can be detected once the unbound portion of the sample is washed away. Detection can be visual or with computer assistance.

The present invention also provides methods of detecting and measuring a DET protein or fragment thereof. An amino acid sequence for a C21orf4 (DET1) protein is set forth herein as SEQ ID NO: 41. An amino acid sequence for a Hs.145049 (DET2) protein is set forth herein as SEQ ID NO: 43. An amino acid sequence for a KIT (DET4) protein is set forth herein as SEQ ID NO: 46. An amino acid sequence for a LSM7 (DET5) protein is set forth herein as SEQ ID NO: 48. An amino acid sequence for a SYNGR2 (DET6) protein is set forth herein as SEQ ID NO: 50. An amino acid sequence for a C11orf8 (DET7) protein is provided herein as SEQ ID NO: 52. An amino acid sequence for a CDH1 (DET8) protein is set forth herein as SEQ ID NO: 54. An amino acid sequence for a FAM13A1 (DET9) protein is set forth herein as SEQ ID NO: 56. An amino acid sequence for an IMPACT (DET10) protein is provided herein as SEQ ID NO: 58. An amino acid sequence for a KIAA1128 (DET11) protein is set forth herein as SEQ ID NO: 60. Therefore, the present invention provides antibodies that bind to the DET protein sequences or fragments thereof set forth herein. The antibody utilized to detect a DET polypeptide, or fragment thereof, can be linked to a detectable label either directly or indirectly through use of a secondary and/or tertiary antibody; thus, bound antibody, fragment or molecular complex can be detected directly in an ELISA or similar assay.

The sample can be on, supported by, or attached to, a substrate which facilitates detection. A substrate of the present invention can be, but is not limited to, a microscope slide, a culture dish, a culture flask, a culture plate, a culture chamber, ELISA plates, as well as any other substrate that can be used for containing or supporting biological samples for analysis according to the methods of the present invention. The substrate can be of any material suitable for the purposes of this invention, such as, for example, glass, plastic, polystyrene, mica and the like. The substrates of the present invention can be obtained from commercial sources or prepared according to standard procedures well known in the art.

Conversely, an antibody or fragment thereof, or an antigenic fragment of a DET protein, can be on, supported by, or attached to a substrate which facilitates detection. Such a substrate can be a mobile solid support. Thus, provided by the invention are substrates including one or more of the antibodies or antibody fragments, or antigenic fragments of a DET polypeptide.

In the methods of the present invention, once the expression levels of one or more DET nucleic acids are measured, these expression levels are compared to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid lesion classification is known. Once this comparison is performed, a difference in expression levels, if present, is identified by one of skill in the art.

A difference or alteration in expression of one or more DET nucleic acids in the test cell population, as compared to the reference cell population, indicates that the test cell population is different from the reference cell population. By “difference” or “alteration” is meant that the expression of one or more DET nucleic acid sequences is either increased or decreased as compared to the expression levels of the reference cell population. If desired, although not necessary, relative expression levels within the test and reference cell populations can be normalized by reference to the expression level of a nucleic acid sequence that does not vary according to thyroid cancer stage in the subject. The absence of a difference or alteration in expression of one or more DET nucleic acids in the test cell population, as compared to the reference cell population, indicates that the test cell population is similar to the reference cell population. As an example, if the reference cell population is from normal thyroid tissue, a similar DET gene expression profile in the test cell population indicates that the test cell population is also normal whereas a different profile indicates that the test cell population is not normal. By “similar” is meant that an expression pattern does not have to be exactly like the reference expression pattern but similar enough such that one of skill in the art would know that the expression pattern is more closely associated with one type of tissue than with another type of tissue. In another example, if the reference cell population is from malignant thyroid tissue, a similar DET gene expression profile in the test cell population indicates that the test cell population is also malignant whereas a different profile indicates that the test cell population is not malignant. Similarly, if the reference cell population is from benign thyroid tissue, a similar DET gene expression profile in the test cell population indicates that the test cell population is also benign whereas a different profile indicates that the test cell population is not benign.
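The optional normalization against a non-varying nucleic acid sequence can be sketched for real-time PCR data. The example below uses the widely used comparative threshold-cycle (2^-ΔΔCt) calculation, which is an assumption introduced here for illustration; the text does not specify a normalization formula. The calculation also assumes roughly 100% amplification efficiency, and the function names and the two-fold cutoff are hypothetical.

```python
def relative_expression(ct_target_test, ct_norm_test,
                        ct_target_reference, ct_norm_reference):
    """Fold change of a target in the test population versus the reference.

    Each Ct is a threshold cycle from real-time PCR. The "norm" values belong
    to a sequence whose expression is assumed not to vary between populations.
    """
    delta_test = ct_target_test - ct_norm_test
    delta_reference = ct_target_reference - ct_norm_reference
    return 2.0 ** -(delta_test - delta_reference)


def describe_change(fold, cutoff=2.0):
    """Label a fold change; the two-fold cutoff is an illustrative assumption."""
    if fold >= cutoff:
        return "upregulated"
    if fold <= 1.0 / cutoff:
        return "downregulated"
    return "similar"


# Made-up Ct values: the target crosses threshold two cycles earlier in the
# test population after normalization, giving a four-fold increase.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
label = describe_change(fold)
```

A lower threshold cycle means more starting template, which is why a negative ΔΔCt corresponds to increased expression in the test cell population.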

Upon observing a difference between the test cell population and a normal reference cell population, one of skill in the art can classify the test cell population as benign or malignant by comparing the expression pattern to known expression patterns for benign and malignant cells. This comparison can be done by comparing the expression pattern of the test cell population to the expression pattern obtained from a plurality of reference cells used as a control while measuring expression levels in the test cell population. One of skill in the art can also compare the expression pattern of the test cell population with a database of expression patterns corresponding to normal, benign and malignant cells and subcategories thereof. For example, upon observing a difference between the test cell population and a reference cell population from normal thyroid tissue, one of skill in the art can compare the expression pattern of the test cell population with a database of expression patterns corresponding to normal, benign and malignant cells. One of skill in the art would then determine which expression pattern in the database is most similar to the expression pattern obtained for the test cell population and classify the test cell population as benign or malignant, as well as classify the test cell population as a type of benign or malignant lesion. For example, if the test cell population is classified as being from a benign lesion, this population can be further classified as being from a follicular adenoma, hyperplastic nodule or papillary adenoma or any other type of benign thyroid lesion. If the test cell population is classified as being from a malignant lesion, this population can be further classified as being from papillary thyroid carcinoma, follicular variant of papillary thyroid carcinoma, follicular carcinoma, Hurthle cell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroid lymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma or any other type of malignant thyroid lesion. Therefore, utilizing the methods of the present invention, one of skill in the art can diagnose a benign or malignant lesion in a subject, as well as the type of benign or malignant lesion in the subject.
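The step of finding which database expression pattern is most similar to the test pattern can be sketched as a nearest-profile lookup. Euclidean distance is used here purely as an illustrative similarity measure; the text does not prescribe one. The database contents, gene ordering and function names are made-up assumptions, not data from this disclosure.

```python
import math


def distance(profile_a, profile_b):
    """Euclidean distance between two expression profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genes in the same order")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(profile_a, profile_b)))


def most_similar_class(test_profile, database):
    """Return the database label whose profile lies closest to the test profile."""
    return min(database, key=lambda label: distance(test_profile, database[label]))


# Hypothetical normalized expression values for three DET genes, in a fixed order.
reference_database = {
    "normal": [1.0, 1.0, 1.0],
    "benign": [1.2, 0.9, 1.6],
    "malignant": [2.0, 0.5, 0.4],
}

test_profile = [2.1, 0.4, 0.5]
predicted = most_similar_class(test_profile, reference_database)
```

The same lookup could be applied a second time within a class, for example against subtype profiles, to mirror the further classification by lesion type described above.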

Staging of Thyroid Cancer

Once a subject has been diagnosed with a malignant lesion or thyroid tumor, the stage of thyroid malignancy can also be determined by the methods of the present invention. Staging of a thyroid malignancy or tumor can be useful in prescribing treatment as well as in determining a prognosis for the subject.

Therefore, also provided by the present invention is a method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11; b) comparing the expression of said nucleic acid sequences to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.

Also provided by the present invention is a method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6; b) comparing the expression of said nucleic acid sequences to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.

Also provided by the present invention is a method of determining a prognosis for a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in a test cell population, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11; b) comparing the expression of said nucleic acid sequences to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11, in the test cell population and reference cell population, thereby determining the prognosis for the subject.

Also provided by the present invention is a method of determining theprognosis for a subject comprising: a) measuring the expression of oneor more nucleic acid sequences selected from the group consisting ofDET1, DET2, DET3, DET4, DET5 and DET6 in a test cell population, whereinat least one cell in said test cell population is capable of expressingone or more nucleic acid sequences selected from the group consisting ofDET1, DET2, DET3, DET4, DET5 and DET6; b) comparing the expression ofsaid nucleic acid sequences to the expression of the nucleic acidsequence(s) in a reference cell population comprising at least one cellfor which a thyroid tumor stage is known; and c) identifying adifference, if present, in expression levels of one or more nucleic acidsequences selected from the group consisting of DET1, DET2, DET3, DET4,DET5 and DET6, in the test cell population and reference cellpopulation, thereby determining the prognosis for the subject.

In staging a thyroid tumor, once the expression levels of one or more DET nucleic acids are measured, these expression levels are compared to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a stage of thyroid tumor is known. Once this comparison is performed, a difference in expression levels, if present, is identified by one of skill in the art.

A difference or alteration in expression of one or more DET nucleic acids in the test cell population, as compared to the reference cell population, indicates that the test cell population is at a different stage than the stage of the reference cell population. By “difference” or “alteration” is meant that the expression of one or more DET nucleic acid sequences is either increased or decreased as compared to the expression levels of the reference cell population. If desired, although not required, relative expression levels within the test and reference cell populations can be normalized by reference to the expression level of a nucleic acid sequence that does not vary according to thyroid cancer stage in the subject. The absence of a difference or alteration in expression of one or more DET nucleic acids in the test cell population, as compared to the reference cell population, indicates that the test cell population is at the same stage as that of the reference cell population. As an example, if the reference cell population is from an early stage thyroid tumor, a similar DET gene expression profile in the test cell population indicates that the test cell population is also from an early stage thyroid tumor, whereas a different profile indicates that the test cell population is not from an early stage thyroid tumor. By “similar” is meant that an expression pattern does not have to be exactly like the reference expression pattern but similar enough such that one of skill in the art would know that the expression pattern is more closely associated with one stage than with another stage.
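The comparison described above can be illustrated with a minimal sketch. It assumes expression levels are held as plain numbers keyed by gene name, that a hypothetical control gene named `CONTROL` is used for the optional normalization, and that a two-fold change counts as a difference. The values, the control gene name and the threshold are illustrative assumptions only and are not taken from the specification.

```python
import math

def normalize(profile, control_gene):
    """Divide each expression level by the level of the control gene."""
    control = profile[control_gene]
    return {gene: level / control
            for gene, level in profile.items() if gene != control_gene}

def altered_genes(test, reference, control_gene, min_log2_fold=1.0):
    """Return genes whose normalized level differs from the reference.

    min_log2_fold is an assumed cut-off (1.0 means a two-fold change).
    """
    test_n = normalize(test, control_gene)
    ref_n = normalize(reference, control_gene)
    changes = {}
    for gene in test_n:
        log2_fold = math.log2(test_n[gene] / ref_n[gene])
        if abs(log2_fold) >= min_log2_fold:
            changes[gene] = "increased" if log2_fold > 0 else "decreased"
    return changes

# Hypothetical measurements for a reference and a test cell population.
reference = {"DET1": 100.0, "DET4": 400.0, "CONTROL": 50.0}
test = {"DET1": 450.0, "DET4": 90.0, "CONTROL": 50.0}

changes = altered_genes(test, reference, "CONTROL")
print(changes)
```

An empty result would correspond to the absence of a difference, that is, a test cell population at the same stage as the reference cell population.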

In order to establish a database of stages of thyroid cancer, one skilled in the art can measure DET nucleic acid levels and/or DET polypeptide levels in numerous subjects in order to establish expression patterns that correspond to clinically defined stages such as, for example, 1) normal, 2) at risk of developing thyroid cancer, 3) pre-cancerous or 4) cancerous as well as other substages defined within each of these stages. These stages are not intended to be limiting as one of skill in the art may define other stages depending on the type of sample, type of cancer, age of the subject and other factors. This database can then be used to compare an expression pattern from a test sample and make clinical decisions. Upon correlation of a DET expression pattern with a particular stage of thyroid cancer, the skilled practitioner can administer a therapy suited for the treatment of cancer. The present invention also allows the skilled artisan to correlate a DET expression pattern with a type of thyroid lesion and correlate the expression pattern with a particular stage of thyroid cancer. The subjects of this invention undergoing anti-cancer therapy can include subjects undergoing surgery, chemotherapy, radiotherapy, immunotherapy or any combination thereof. Examples of chemotherapeutic agents include cisplatin, 5-fluorouracil and S-1. Immunotherapeutic methods include administration of interleukin-2 and interferon-α.

In determining the prognosis for a subject, once the expression levels of one or more DET nucleic acids are measured, these expression levels are compared to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a prognosis is known. Once this comparison is performed, a difference in expression levels, if present, is identified by one of skill in the art.

One skilled in the art can measure DET nucleic acid levels and/or DETpolypeptide levels in order to determine a prognosis for a subject. Oneof skill in the art can measure DET nucleic acid levels and/or DETpolypeptide levels in numerous subjects with varying prognoses in orderto establish reference expression patterns that correspond to prognosesfor subjects. As utilized herein, “prognosis” means a prediction ofprobable development and/or outcome of a disease. These referenceexpression patterns or a database of reference expression patterns canthen be used to compare an expression pattern from a test sample anddetermine what the prognosis for a subject is. These expression patternscan also be used to compare an expression pattern from a test samplefrom a subject and determine whether or not a subject can recover fromthe disease. Upon correlation of a DET expression pattern with aparticular prognosis, the skilled practitioner can then determine if atherapy suited for the treatment of cancer is applicable.

The present invention provides a computer system comprising a) adatabase including records comprising a plurality of reference DET geneexpression profiles or patterns for benign, malignant and normal tissuesamples and associated diagnosis and therapy data; and b) a userinterface capable of receiving a selection of one or more test geneexpression profiles for use in determining matches between the testexpression profiles and the reference DET gene expression profiles anddisplaying the records associated with matching expression profiles. Thedatabase can also include DET gene expression profiles for subclasses ofbenign tissue samples such as follicular adenoma, hyperplastic nodule,papillary adenoma, thyroiditis nodule and multinodular goiter. Thedatabase can also include DET gene expression profiles for subclasses ofmalignant tissue samples such as papillary thyroid carcinoma, follicularvariant of papillary thyroid carcinoma, follicular carcinoma, Hurthlecell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroidlymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma.The database can also include DET gene expression profiles for stages ofthyroid cancer as well as DET gene expression profiles that correspondto prognoses for subjects.

It will be appreciated by those skilled in the art that the DET geneexpression profiles provided herein as well as the DET expressionprofiles identified from samples and subjects can be stored, recorded,and manipulated on any medium which can be read and accessed by acomputer. As used herein, the words “recorded” and “stored” refer to aprocess for storing information on a computer medium. A skilled artisancan readily adopt any of the presently known methods for recordinginformation on a computer readable medium to generate a list of DET geneexpression profiles comprising one or more of the DET expressionprofiles of the invention. Another aspect of the present invention is acomputer readable medium having recorded thereon at least 2, 5, 10, 15,20, 25, 30, 50, 100, 200, 250, 300, 400, 500, 1000, 2000, 3000, 4000 or5000 expression profiles of the invention or expression profilesidentified from subjects.

Computer readable media include magnetically readable media, optically readable media, electronically readable media and magnetic/optical media. For example, the computer readable media may be a hard disc, a floppy disc, a magnetic tape, CD-ROM, DVD, RAM, or ROM as well as other types of media known to those skilled in the art.

Embodiments of the present invention include systems, particularlycomputer systems which contain the DET gene expression informationdescribed herein. As used herein, “a computer system” refers to thehardware components, software components, and data storage componentsused to store and/or analyze the DET gene expression profiles of thepresent invention or other DET gene expression profiles. The computersystem preferably includes the computer readable media described above,and a processor for accessing and manipulating the DET gene expressiondata.

Preferably, the computer is a general purpose system that comprises a central processing unit (CPU), one or more data storage components for storing data, and one or more data retrieving devices for retrieving the data stored on the data storage components. A skilled artisan can readily appreciate that any one of the currently available computer systems is suitable.

In one particular embodiment, the computer system includes a processorconnected to a bus which is connected to a main memory, preferablyimplemented as RAM, and one or more data storage devices, such as a harddrive and/or other computer readable media having data recorded thereon.In some embodiments, the computer system further includes one or moredata retrieving devices for reading the data stored on the data storagecomponents. The data retrieving device may represent, for example, afloppy disk drive, a compact disk drive, a magnetic tape drive, a harddisk drive, a CD-ROM drive, a DVD drive, etc. In some embodiments, thedata storage component is a removable computer readable medium such as afloppy disk, a compact disk, a magnetic tape, etc. containing controllogic and/or data recorded thereon. The computer system mayadvantageously include or be programmed by appropriate software forreading the control logic and/or the data from the data storagecomponent once inserted in the data retrieving device. Software foraccessing and processing the expression profiles of the invention (suchas search tools, compare tools, modeling tools, etc.) may reside in mainmemory during execution.

In some embodiments, the computer system may further comprise a programfor comparing expression profiles stored on a computer readable mediumto another test expression profile on a computer readable medium. An“expression profile comparer” refers to one or more programs which areimplemented on the computer system to compare an expression profile withother expression profiles.

Accordingly, one aspect of the present invention is a computer systemcomprising a processor, a data storage device having stored thereon aDET gene expression profile of the invention, a data storage devicehaving retrievably stored thereon reference DET gene expression profilesto be compared with test or sample sequences and an expression profilecomparer for conducting the comparison. The expression profile comparermay indicate a similarity between the expression profiles compared oridentify a difference between the two expression profiles.
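As one possible illustration of an “expression profile comparer”, the sketch below matches a test profile against stored reference records by Euclidean distance and returns the closest record. The record layout, the choice of distance measure, the gene subset and all numeric values (hypothetical log2 expression ratios) are assumptions made for illustration and are not specified in this disclosure.

```python
import math

# Assumed gene order used to turn a profile into a numeric vector.
GENES = ["DET1", "DET2", "DET3", "DET4"]

def distance(profile_a, profile_b):
    """Euclidean distance between two profiles over the assumed gene list."""
    return math.dist([profile_a[g] for g in GENES],
                     [profile_b[g] for g in GENES])

def closest_record(test_profile, database):
    """Return the reference record whose profile is nearest the test profile."""
    return min(database,
               key=lambda record: distance(test_profile, record["profile"]))

# Hypothetical reference records; values are invented log2 ratios.
database = [
    {"label": "benign",
     "profile": {"DET1": 0.1, "DET2": 0.0, "DET3": -0.2, "DET4": 0.1}},
    {"label": "malignant",
     "profile": {"DET1": 1.5, "DET2": 1.2, "DET3": -0.5, "DET4": -1.8}},
]

test_profile = {"DET1": 1.2, "DET2": 0.9, "DET3": -0.4, "DET4": -1.5}
match = closest_record(test_profile, database)
print(match["label"], round(distance(test_profile, match["profile"]), 3))
```

A comparer of this kind reports similarity as a small distance; a large distance to every stored record would indicate a difference from all reference profiles.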

Alternatively, the computer program may be a computer program which compares a test expression profile(s) from a subject or a plurality of subjects to a reference expression profile(s) in order to determine whether the test expression profile(s) differs from or is the same as a reference expression profile.

This invention also provides for a computer program that correlates DETgene expression profiles with a type of cancer and/or a stage of cancerand/or a prognosis. The computer program can optionally includetreatment options or drug indications for subjects with DET geneexpression profiles associated with a type of cancer and/or stage ofcancer.

Screening Methods

Further provided by the present invention is a method of identifying anagent for treating a thyroid tumor, the method comprising: a) contactinga population of thyroid tumor cells from a subject for which a tumorstage is known, wherein at least one cell in said population is capableof expressing one or more nucleic acid sequences selected from the groupconsisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 andDET11, with a test agent; b) measuring the expression of one or morenucleic acid sequences selected from the group consisting of DET1, DET2,DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11 in the cellpopulation; c) comparing the expression of the nucleic acid sequence(s)to the expression of the nucleic acid sequence(s) in a reference cellpopulation comprising at least one cell for which a thyroid tumor stageis known; and d) identifying a difference, if present, in expressionlevels of one or more nucleic acid sequences selected from the groupconsisting of DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 andDET11, in the test cell population and reference cell population, suchthat if there is a difference corresponding to an improvement, atherapeutic agent for treating thyroid tumor has been identified.

Further provided by the present invention is a method of identifying anagent for treating a thyroid tumor, the method comprising: a) contactinga population of thyroid tumor cells from a subject for which a tumorstage is known, wherein at least one cell in said test population iscapable of expressing one or more nucleic acid sequences selected fromthe group consisting of DET1, DET2, DET3, DET4, DET5 and DET6, with atest agent; b) measuring the expression of one or more nucleic acidsequences selected from the group consisting of DET1, DET2, DET3, DET4,DET5 and DET6 in the cell population; c) comparing the expression of thenucleic acid sequence(s) to the expression of the nucleic acidsequence(s) in a reference cell population comprising at least one cellfor which a thyroid tumor stage is known; and d) identifying adifference, if present, in expression levels of one or more nucleic acidsequences selected from the group consisting of DET1, DET2, DET3, DET4,DET5 and DET6, in the cell population and reference cell population,such that if there is a difference corresponding to an improvement, atherapeutic agent for treating thyroid tumor has been identified.

The test agents used in the methods described herein can be made bymethods standard in the art and include, but are not limited to,chemicals, small molecules, antisense molecules, siRNAs, drugs,antibodies, peptides and secreted proteins.

By “improvement” is meant that the treatment leads to a shift in athyroid tumor stage to a less advanced stage. As mentioned above, theexpression pattern obtained for the test cell population can be comparedto expression patterns in a database before and after contacting thetest cell population with a test agent to determine the stage of thetest cell population before and after treatment.

The reference cell population can be from normal thyroid tissue. For example, if the cell population from the subject is from an early stage thyroid tumor, and after treatment, the expression pattern of the cell population, when compared to the reference cell population from normal thyroid tissue, is similar to that of the reference cell population, the agent is effective in treating a thyroid tumor. By “similar” is meant that the expression pattern does not have to be exactly like the expression pattern from normal thyroid tissue but similar enough such that one of skill in the art would know that the treatment leads to expression patterns more closely associated with normal thyroid tissue. As another example, if both the cell population from the subject and the reference cell population are from an early stage thyroid tumor, and after treatment, the expression pattern of the cell population is similar to the reference cell population, the agent is not effective in treating a thyroid tumor. By “similar” is meant that the expression pattern does not have to be exactly like the expression pattern from the early stage thyroid tumor cell population but similar enough such that one of skill in the art would know that the treatment does not lead to an expression pattern corresponding to a less advanced thyroid tumor stage. As another example, if both the cell population from the subject and the reference cell population are from an early stage thyroid tumor, and after treatment, the expression pattern of the cell population is different from the reference cell population, and correlates with a less advanced thyroid tumor stage, the agent is effective in treating a thyroid tumor. These examples are not intended to be limiting with regard to the types of thyroid tumor populations that can be contacted with an agent, the types of agents that can be utilized, the type of reference cell population that can be utilized or the effects observed as there are numerous variations known to one of skill in the art for performing these methods.

Treatment Methods

Also provided by the present invention is a method of treating malignantthyroid lesions or thyroid cancer in a subject suffering from or at riskof developing thyroid cancer comprising administering to the subject anagent that modulates the expression of one or more DET sequences. By “atrisk for developing” is meant that the subject's prognosis is lessfavorable and that the subject has an increased likelihood of developingthyroid cancer. Administration of the agent can be prophylactic ortherapeutic.

By “modulation” is meant that the expression of one or more DET sequences can be increased or decreased.

For example, KIT (DET4), LSM7 (DET5), FAM13A1 (DET9), C11orf8 (DET7), KIAA1128 (DET11), IMPACT (DET10) and CDH1 (DET8) were all downregulated or underexpressed in malignant thyroid lesions as compared to normal thyroid tissue. Therefore, a subject can be treated with an effective amount of an agent that increases the amount of the downregulated or underexpressed nucleic acids in the subject. Administration can be systemic or local, e.g. in the immediate vicinity of the subject's cancerous cells. This agent can be, for example, the protein product of a downregulated or underexpressed DET gene or a biologically active fragment thereof, a nucleic acid encoding a downregulated or underexpressed DET gene and having expression control sequences permitting expression in the thyroid cancer cells, or an agent which increases the endogenous level of expression of the gene.

With regard to genes that are upregulated or overexpressed as compared to normal thyroid tissue, C21orf4 (DET1) and Hs.145049 (DET2) were upregulated or overexpressed in malignant thyroid lesions as compared to normal thyroid tissue. Therefore, a subject can be treated with an effective amount of an agent that decreases the amount of the upregulated or overexpressed nucleic acids in the subject. Administration can be systemic or local, e.g. in the immediate vicinity of the subject's cancerous cells. The agent can be, for example, a nucleic acid that inhibits or antagonizes the expression of the overexpressed DET gene, such as an antisense nucleic acid or an siRNA. The agent can also be an antibody that binds to a DET protein that is overexpressed.

In the treatment methods of the present invention, the subject can betreated with one or more agents which decrease the expression ofoverexpressed DET sequences alone or in combination with one or moreagents which increase the expression of DET sequences that aredownregulated or underexpressed in thyroid cancer. The subject can alsobe treated with one or more agents which increase the expression of DETsequences alone or in combination with one or more agents which decreasethe expression of overexpressed DET sequences.

These treatment methods can be combined with other anti-cancer treatments such as surgery, chemotherapy, radiotherapy, immunotherapy or any combination thereof. Examples of chemotherapeutic agents include cisplatin, 5-fluorouracil and S-1. Immunotherapeutic methods include administration of interleukin-2 and interferon-α.

Identification of Differentially Expressed Thyroid Genes

The present invention also provides a method of identifying differentially expressed genes and/or expression patterns for such genes in other types of benign and malignant lesions. As set forth in the Examples, one of skill in the art can utilize gene expression profiling and supervised machine learning algorithms to construct a molecular classification scheme for other types of thyroid tumors. These include any type of benign lesion such as papillary adenoma, multinodular goiter or thyroiditis nodule, and any type of malignant lesion, such as papillary thyroid carcinoma, follicular carcinoma, Hurthle cell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroid lymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma. Those genes and expression patterns identified via these methods can be utilized in the methods of the present invention to diagnose, stage and treat cancer.

Kits

The present invention also provides for a kit comprising one or more reagents for detecting one or more nucleic acid sequences selected from the group consisting of DET1-DET11. In various embodiments the expression of one or more of the sequences represented by DET1-DET11 is measured. The kit can identify the DET nucleic acids by having homologous nucleic acid sequences, such as oligonucleotide sequences, complementary to a portion of the recited nucleic acids, or antibodies to proteins encoded by the DET nucleic acids. The kit can also include amplification primers for performing RT-PCR, such as those set forth in Table 4, and probes, such as those set forth in Table 4, that can be fluorescently labeled for detecting amplification products in, for example, a Taqman assay. The kits of the present invention can optionally include buffers, enzymes, detectable labels and other reagents for detecting the expression of DET sequences described herein.

The following examples are put forth so as to provide those of ordinaryskill in the art with a complete disclosure and description of how theantibodies, polypeptides, nucleic acids, compositions, and/or methodsclaimed herein are made and evaluated, and are intended to be purelyexemplary of the invention and are not intended to limit the scope ofwhat the inventors regard as their invention. Efforts have been made toensure accuracy with respect to numbers (e.g., amounts, temperature,etc.), but some errors and deviations should be accounted for.

EXAMPLES

DNA microarrays allow quick and complete evaluation of a cell'stranscriptional activity.

Expression genomics is very powerful in that it can generate expression data for a large number of genes simultaneously across multiple samples. In cancer research, intriguing applications of expression arrays include assessing the molecular components of the neoplastic process and cancer classification (1). Classification of human cancers into distinct groups based on their molecular profile rather than their histological appearance can be more relevant to specific cancer diagnoses and cancer treatment regimes. Several attempts to formulate a consensus about classification and treatment of thyroid carcinoma based on standard histopathologic analysis have resulted in published guidelines for diagnosis and initial disease management (2). In the past few decades no improvement has been made in the differential diagnosis of thyroid tumors by fine needle aspiration biopsy (FNA), specifically suspicious or indeterminate thyroid lesions, suggesting that a new approach to this should be explored. Therefore in this study a gene expression approach was developed to diagnose benign vs. malignant thyroid lesions in 73 patients with thyroid tumors. A 10-gene model and a 6-gene model were developed to differentiate benign vs. malignant thyroid tumors. These results provide a molecular classification system for thyroid tumors and this in turn provides a more accurate diagnostic tool for the clinician managing patients with suspicious thyroid lesions.

It is well known that cancer results from changes in gene expression patterns that are important for cellular regulatory processes such as growth, differentiation, DNA duplication, mismatch repair and apoptosis. It is also becoming more apparent that effective treatment and diagnosis of cancer is dependent upon an understanding of these important processes. Classification of human cancers into distinct groups based on their origin and histopathological appearance has historically been the foundation for diagnosis and treatment. This classification is generally based on cellular architecture, certain unique cellular characteristics and cell-specific antigens only. In contrast, gene expression assays have the potential to identify thousands of unique characteristics for each tumor type (3) (4). Elucidating a genome wide expression pattern for disease states not only could have an enormous impact on our understanding of specific cell biology, but could also provide the necessary link between molecular genetics and clinical medicine (5) (6) (7).

Thyroid carcinoma represents 1% of all malignant diseases, but 90% of all neuroendocrine malignancies. It is estimated that 5-10% of the population will develop a clinically significant thyroid nodule during their lifetime (8). The best available test in the evaluation of a patient with a thyroid nodule is fine needle aspiration biopsy (FNA) (9). Of the malignant FNAs, the majority are from papillary thyroid cancers (PTC) or its follicular variant (FVPTC). These can be easily diagnosed if they have the classic cytologic features including abundant cellularity and enlarged nuclei containing intra-nuclear grooves and inclusions (10). Indeed, one third of the time these diagnoses are clear on FNA. Fine needle aspiration biopsy of thyroid nodules has greatly reduced the need for thyroid surgery and has increased the percentage of malignant tumors among excised nodules (11, 12). In addition, the diagnosis of malignant thyroid tumors, combined with effective therapy, has led to a marked decrease in morbidity due to thyroid cancer. Unfortunately, many thyroid FNAs are not definitively benign or malignant, yielding an “indeterminate” or “suspicious” diagnosis. The prevalence of indeterminate FNAs varies, but typically ranges from 10-25% of FNAs (13-15). In general, thyroid FNAs are indeterminate due to overlapping or undefined morphologic criteria for benign versus malignant lesions, or focal nuclear atypia within otherwise benign specimens. Of note, twice as many patients are referred for surgery for a suspicious lesion (10%) than for a malignant lesion (5%), an occurrence that is not widely appreciated since the majority of FNAs are benign. Therefore when the diagnosis is unclear on FNA these patients are classified as having a suspicious or indeterminate lesion only. It is well known that frozen section analysis often yields no additional information.

The question then arises: “Should the surgeon perform a thyroidlobectomy, which is appropriate for benign lesions or a totalthyroidectomy, which is appropriate for malignant lesions when thediagnosis is uncertain both preoperatively and intra-operatively?”Thyroid lobectomy as the initial procedure for every patient with asuspicious FNA could result in the patient with cancer having to undergoa second operation for completion thyroidectomy. Conversely, totalthyroidectomy for all patients with suspicious FNA would result in amajority of patients undergoing an unnecessary surgical procedure,requiring lifelong thyroid hormone replacement and exposure to theinherent risks of surgery (16).

There is a compelling need to develop more accurate initial diagnostictests for evaluating a thyroid nodule. Recent studies suggest that geneexpression data from cDNA microarray analysis holds promise forimproving tumor classification and for predicting response to therapyamong cancer patients (17) (18) (19). No clear consensus existsregarding which computational tool is optimal for the analysis of largegene expression profiling datasets, especially when they are used topredict outcome (20).

This invention describes the use of gene expression profiling andsupervised machine learning algorithms to construct a molecularclassification scheme for thyroid tumors (22). The gene expressionsignatures provided herein include new tumor related genes whose encodedproteins can be useful for improving the diagnosis of thyroid tumors.

Tissue Samples

Thyroid tissues collected under Johns Hopkins University Hospital Institutional Review Board-approved protocols were snap-frozen in liquid nitrogen and stored at −80° C. until use. The specimens were chosen based on their tumor type: papillary thyroid carcinoma (PTC, n=17), follicular variant of PTC (FVPTC, n=15), follicular adenoma (FA, n=16) and hyperplastic nodule (HN, n=15). All diagnoses were made by the Surgical Pathology Department at Johns Hopkins.

Tissue Processing and Isolation of RNA

Frozen sections of 100-300 mg of tissue were collected in test tubes containing 1 ml of Trizol. Samples were transferred to FastRNA tubes containing mini beads and homogenized in a FastPrep beater (Bio101 Savant, Carlsbad, Calif.) for 1.5 min at speed 6. The lysate was transferred to a new tube and total RNA was extracted according to the Trizol protocol (Molecular Research Center, Inc. Cincinnati, Ohio). Approximately 12 µg of total RNA was obtained from each tumor sample. The total RNA was then subjected to two rounds of amplification following the modified Eberwine method (23) (24) resulting in approximately 42 µg of messenger RNA (mRNA). The quality of the extracted RNA was tested by spectrophotometry and by evaluations on minichips (BioAnalyzer, Agilent Technologies, Palo Alto, Calif.).

Microarray Analysis

Hybridization was performed on 10 k human cDNA microarrays, Hs-UniGem2, produced by the NCI/NIH (ATC, Gaithersburg, Md.). Comparisons were made for each tumor with the same control, which consisted of amplified RNA extracted from normal thyroid tissue and provided by Ambion Inc (Austin, Tex.). Fluorescent marker dyes (Cy5 and Cy3) were used to label the test and control samples, respectively. The respective dyes and samples were also switched in order to test for any labeling bias. The mixture of the two populations of RNA species was then hybridized to the same microarray and incubated for 16 hr at 42° C. cDNA microarrays were then washed and scanned using the GenePix™ 4000B (Axon Instruments Inc., Calif.) and images were analyzed with GenePix software version 3.0. For each sample a file containing the image of the array and an Excel file containing the expression ratio values for each gene was uploaded onto the MadbArray web-site (National Center for Biotechnology Information/NIH) http://nciarray.nci.nih.gov for further analysis. To accurately compare measurements from different experiments, the data was normalized and the ratio (Signal Cy5/Signal Cy3) was calculated so that the median (Ratio) was 1.0.
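The normalization step can be sketched as follows. The sketch assumes a simple global scaling in which every Cy5/Cy3 ratio on an array is divided by the array's median ratio; the text states only that the median ratio was set to 1.0, so the exact procedure used by the analysis software may differ. The signal values are invented.

```python
from statistics import median

def normalize_ratios(cy5_signals, cy3_signals):
    """Scale Cy5/Cy3 ratios so that the median ratio on the array is 1.0."""
    ratios = [s5 / s3 for s5, s3 in zip(cy5_signals, cy3_signals)]
    scale = median(ratios)
    return [ratio / scale for ratio in ratios]

# Hypothetical spot intensities for five genes on one array.
cy5 = [200.0, 400.0, 800.0, 100.0, 300.0]  # test sample
cy3 = [100.0, 100.0, 100.0, 100.0, 100.0]  # control sample

normalized = normalize_ratios(cy5, cy3)
print(median(normalized))
```

After this scaling, a ratio above 1.0 indicates a gene expressed above the array median in the test sample relative to the control, which makes ratios from different arrays comparable.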

Immunohistochemistry

Immunohistochemistry studies utilizing antibodies to two gene products in the predictor models have also been performed, and these data correlate with the expression data. Taqman analysis was performed for CDH1 and KIT. Both KIT and CDH1 expression decreased in malignancy, which correlates with the microarray data. As shown in FIG. 6, immunohistochemical results likewise show that both KIT and CDH1 expression decrease in malignancy, in agreement with the expression results obtained via microarray and Taqman analysis.

Statistical Analysis

Data from the 73 thyroid tumors were used to build a benign (FA and HN) vs. malignant (PTC and FVPTC) expression ratio-based model, capable of predicting the diagnosis (benign vs. malignant) of each sample. After normalization, a file containing the gene expression ratio values from all 73 samples was imported into a statistical analysis software package (Partek Inc., Mo.). Samples were divided into two sets: one set (63 samples) was used to train the diagnosis predictor model and a second set (10 samples) was used as a validation set to test the model. These 10 samples were not previously used in any other analysis. As a first step, the data from the 63 samples were subjected to Principal Component Analysis (PCA) to perform an exploratory analysis and to view the overall trend of the data. PCA is an exploratory technique that describes the structure of high dimensional data by reducing its dimensionality. It is a linear transformation that converts n original variables (gene expression ratio values) into n new variables or principal components (PC), which have three important properties: they 1) are ordered by the amount of variance explained; 2) are uncorrelated; and 3) explain all variation in the data. The new observations (each array) are represented by points in a three dimensional space. The distance between any pair of points is related to the similarity between the two observations in high dimensional space. Observations that are near each other are similar for a large number of variables and, conversely, the ones that are far apart are different for a large number of variables.
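The three properties of principal components listed above can be checked on a small example. The following is a minimal sketch, not the Partek implementation: it uses power iteration with deflation on the sample covariance matrix, and the six samples and three expression ratios per sample are hypothetical values invented for illustration.

```python
import math

def covariance(rows):
    """Center the columns; return (centered rows, sample covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return centered, cov

def principal_components(cov, iterations=500):
    """Eigen-decompose a symmetric matrix by power iteration with deflation."""
    p = len(cov)
    work = [row[:] for row in cov]
    variances, axes = [], []
    for _ in range(p):
        start = [1.0 / (j + 1) for j in range(p)]
        length = math.sqrt(sum(x * x for x in start))
        v = [x / length for x in start]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: variance explained along this axis.
        lam = sum(v[i] * work[i][j] * v[j] for i in range(p) for j in range(p))
        variances.append(lam)
        axes.append(v)
        # Deflate so the next pass finds the next-largest component.
        for i in range(p):
            for j in range(p):
                work[i][j] -= lam * v[i] * v[j]
    return variances, axes

# Hypothetical expression ratios (three genes) for 3 benign and 3 malignant samples.
samples = [
    [1.5, 1.2, 0.70], [1.6, 1.1, 0.80], [1.4, 1.3, 0.75],   # benign
    [0.9, 0.4, 1.70], [0.8, 0.3, 1.60], [1.0, 0.5, 1.90],   # malignant
]
centered, cov = covariance(samples)
variances, axes = principal_components(cov)
# Coordinates of each sample on the new axes (PC1, PC2, PC3).
scores = [[sum(c[j] * axis[j] for j in range(len(c))) for axis in axes]
          for c in centered]
total_variance = sum(cov[i][i] for i in range(len(cov)))
```

In this toy data set the first component carries almost all of the variance and places the benign and malignant samples on opposite sides of the origin, which mirrors the kind of visual separation the PCA plots are used for here.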

An Anova test with Bonferroni correction was then used to identify genes that were statistically different between the two groups. The resulting significant genes were used to build a diagnosis-predictor model. Variable (gene) selection analysis with cross-validation was performed several times, each time testing a different number of gene combinations. For cross-validation, the “leave-one-out” method was used to estimate the accuracy of the output class prediction rule: the whole dataset was divided into different parts and each part was individually removed from the data set. The remaining data set was used to train a class prediction rule, and the resulting rule was applied to predict the class of the “held-out” sample.
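The leave-one-out procedure can be sketched as follows. The class prediction rule used here is a nearest-centroid classifier, which is an assumption made for illustration only; the rule built by the statistical software is not specified in this text. The sample values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose mean profile is closest."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: squared_distance(centroids[lab], sample))

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample in turn, train on the rest, predict the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical expression ratios for three genes (illustrative only).
profiles = [
    [1.5, 1.2, 0.70], [1.6, 1.1, 0.80], [1.4, 1.3, 0.75],
    [0.9, 0.4, 1.70], [0.8, 0.3, 1.60], [1.0, 0.5, 1.90],
]
labels = ["benign"] * 3 + ["malignant"] * 3
accuracy = leave_one_out_accuracy(profiles, labels)
```

Because the held-out sample never contributes to the centroids it is compared against, the resulting accuracy is an estimate of performance on samples the rule has not seen.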

An Anova test with Bonferroni correction was used on 9100 genes to identify ones that were statistically different among the 4 groups. PCA analysis of the 63 samples (FIG. 1) using the statistically significant genes showed a clear organization of the samples based on diagnosis. The same analysis (Anova test with Bonferroni correction) was performed on the dataset organized, this time, in benign (HN-FA) and malignant (PTC-FVPTC). For this analysis, 47 genes were found to be significantly different between the benign and the malignant group (Table 1). PCA analysis also separated the data clearly into two groups (FIG. 2).
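For illustration, the Bonferroni correction multiplies each raw p-value by the number of tests performed (here 9100 genes) and caps the result at 1.0. The sketch below uses hypothetical raw p-values; only the gene count and the 0.05 threshold come from the text and Table 1.

```python
def bonferroni_adjust(p_values, n_tests):
    """Bonferroni-adjusted p-values: raw p times number of tests, capped at 1.0."""
    return [min(1.0, p * n_tests) for p in p_values]

N_GENES = 9100   # number of genes tested, as stated above
ALPHA = 0.05     # significance threshold used in Table 1

# Hypothetical raw p-values for four genes (illustrative only).
raw_p = [1e-9, 2e-6, 1e-4, 0.03]
adjusted = bonferroni_adjust(raw_p, N_GENES)
significant = [p < ALPHA for p in adjusted]
```

With this many tests, a gene needs a raw p-value below roughly 0.05/9100 to remain significant, which is why only a few dozen genes survive the correction.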

For the purpose of this invention, attention was focused on the analysis of the dataset separating benign from malignant. These 47 genes were used to build a diagnostic predictor model. Variable (gene) selection analysis with cross validation was performed with a different number of gene combinations. After cross-validation the model was 87.1% accurate in predicting benign versus malignant with an error rate of 12.9% (Table 2). This suggested that it was possible to use the data to create a diagnostic predictor model.

The most accurate results were obtained with a combination of 6 to 10 genes. This combination of genes constituted a predictor model and a validation set of 10 additional thyroid samples was used to confirm the accuracy of this model (Table 3). The pathologic diagnosis for each sample was kept blinded to researchers at the time of the analysis. When the blind was broken, it was found that 9 of the samples were diagnosed in concordance with the pathologic diagnosis by our model. One sample that was originally diagnosed as a benign tumor by standard histologic criteria was diagnosed as malignant by our model. This sample was re-reviewed by the Pathology Department at The Johns Hopkins Hospital and was subsequently found to be a neoplasm of uncertain malignant potential. The diagnosis was changed by pathology after review for clinical reasons, not because of the gene profiling. What is extraordinary is that this change was not discovered until the genotyping suggested that the lesion might be malignant and the pathology report was examined a second time. By that time the report had been amended and it suggested that the tumor had undetermined malignant potential. Regarding the other tumors, all were examined a second time before array analysis to be certain that the tissue was representative and consistent with the pathology report. Therefore, this model was correct in assigning the diagnosis in all 10 cases.

PCA analysis using only the six most informative genes was conducted on all the samples with and without the 10 unknown samples (FIG. 3A-B). It is clear from the PCA organization that the six genes strongly distinguish benign from malignant. In addition, these same genes can be used for diagnosis with respect to the four subcategories of thyroid lesion. Between the two predictor models, 11 genes are informative.

The identification of markers that can determine a specific type of tumor, predict patient outcome or the tumor response to specific therapies is currently a major focus of cancer research. This invention provides the use of gene expression profiling to build a predictor model able to distinguish a benign thyroid tumor from a malignant one. Such a model, when applied to FNA cytology, could greatly impact the clinical management of patients with suspicious thyroid lesions. To build the predictor model, four types of thyroid lesions, papillary thyroid carcinoma (PTC), follicular variant of papillary thyroid carcinoma (FVPTC), follicular adenoma (FA) and hyperplastic nodules (HN), were used. Taken together, these represent the majority of thyroid lesions that often present as “suspicious”. The choice of the appropriate control for comparative array experiments is often the subject of much discussion. In this case, in order to construct a predictive diagnostic algorithm based on a training set of samples, it was necessary to have a “common” reference standard to which all individual samples are compared. In this way, differences between each, and in fact all, samples could be analyzed. Had each tumor been compared to the adjacent normal thyroid tissue from the same patient, it would only be possible to comment on gene changes within each patient. A source of RNA from normal thyroid tissue was chosen since the source was replenishable and could be used for all of our future experiments once the diagnostic predictor algorithm was validated.

The mRNA extracted from each sample was amplified. It was found that the quality of the arrays and the data derived from them is superior when mRNA has been amplified from total RNA. Of note, all samples and all reference controls were amplified in the same fashion. Analysis of the overall gene expression profiles revealed that the benign lesions (FA, HN) could be distinguished from the malignant lesions (PTC, FVPTC). Furthermore, although not statistically significant, the 4 tumor sub-types appeared to have different gene profiles. The use of a powerful statistical analysis program (Partek) helped discover a group of 11 genes that were informative enough to create a predictor model. Two combinations were created out of these 11 genes, a combination of six genes and a combination of 10 genes. PCA analysis of the six most informative genes resulted in a nearly perfect distinction between the two groups (FIG. 3A-B). In general, PCA analysis describes similarities between samples and is not a commonly employed tool for predicting diagnosis. However, in this study the distinction was so powerful that it was possible to visually make a correct diagnosis for each of the 10 unknown samples (FIG. 3A-B). The predictor model determines the kind of tumor with a specific probability value. The diagnosis of all 10 unknown samples was correctly predicted, with a more accurate prediction using the six-gene combination (Table 3, see probabilities). It is clear from the graph in FIG. 4 how the combination of gene expression values gives a distinctly different profile between the benign and malignant lesions. However, within each tumor group there are differences among the profiles of the five samples tested. This could be explained by the fact that each tumor, even if of the same type, could be at a different stage of progression.

Of the 11 genes that were informative for the diagnosis, five genes are known genes and for the other six genes no functional studies are yet available. The genes that were identified are the ones that the model has determined best group the known samples into their correct diagnosis. Those genes identified are the ones that consistently grouped the samples into the categories and subcategories described herein. This type of pattern assignment is based on the analysis of thousands of genes and the recognition by the computer software that certain patterns are associated repeatedly with certain diagnostic groups. This type of analysis derives its power (and significance) from the number of genes that are analyzed, rather than the degree of up or down regulation of any particular gene. With respect to the specific genes identified, the computer is not biased by the knowledge of genes previously identified as associated with thyroid cancer. The genes it identifies are those that best differentiate the varied diagnoses of the known samples. This occurs during the “training” phase of establishing the algorithm. Once the computer is trained with data from comparisons of RNA from known diagnoses to a standard reference, unknowns can be tested and fit to the diagnostic groups predicted during the training. For the purposes of such an approach, individual genes are less important. A specific gene which is found in a univariate study to be associated with thyroid cancer may not turn out to be the best multivariate predictor of a diagnosis in an analysis such as the one presented here.

TaqMan Assay Utilizing 6 Gene Predictor Model and 10 Gene Predictor Model

Utilizing the information obtained for these differentially expressed genes, TaqMan Real Time PCR analysis was performed for the group of 6 genes and for the group of 10 genes that are diagnostic for benign versus malignant thyroid lesions, using total RNA extracted from thyroid tissue as well as RNA from control normal thyroids.

Thyroid samples were collected under Johns Hopkins University Hospital Institutional Review Board-approved protocols. The samples were snap-frozen in liquid nitrogen and stored at −80° C. until use. The specimens were chosen based on their tumor type: papillary thyroid cancer (PTC); follicular variant of papillary thyroid cancer (FVPTC); follicular adenoma (FA); and hyperplastic nodule (HN). All diagnoses were made using standard clinical criteria by the Surgical Pathology Department at Johns Hopkins University Hospital.

Tissue Processing and Isolation of RNA

Frozen sections of 100-300 mg of tissue were collected in test tubes containing 1 ml of Trizol. Samples were transferred to FastRNA™ tubes containing mini beads and homogenized in a FastPrep beater (Bio101Savant™, Carlsbad, Calif.) for 1.5 min at speed 6. The lysate was transferred to a new tube and total RNA was extracted according to the Trizol protocol in a final volume of 40 µl RNase-free water (Molecular Research Center, Inc., Cincinnati, Ohio). The quality of the extracted RNA was tested by spectrophotometry and by evaluation on minichips (BioAnalyzer; Agilent Technologies, Palo Alto, Calif.). Minimal criteria for a successful total RNA run were the presence of two ribosomal peaks and one marker peak. Normal human thyroid RNA (Clontech, BD Biosciences) served as a reference control. The total RNA extracted from tissue samples and normal thyroid was then used as the template for one round of reverse transcription to generate cDNA. Eight microliters of purified total RNA (containing up to 3 µg of total RNA) was added to a mix containing 3 µg/1 µl of random hexamer primers, 4 µl of 1× reverse transcription buffer, 2 µl of DTT, 2 µl of dNTPs, 1 µl of RNase inhibitor, and 2 µl of SuperScript II reverse transcriptase (200 U/µl) in a 20 µl reaction volume (all purchased from Invitrogen, Carlsbad, Calif.). Reverse transcription was performed according to the SuperScript First-Strand Synthesis System instructions (Invitrogen, Carlsbad, Calif.). Following the reverse transcription reaction, the SuperScript II enzyme was heat inactivated, and degradation of the original template RNA was performed using 2 U/1 µl of RNase H (Invitrogen, Carlsbad, Calif.) for 20 minutes at 37° C. The final volume of the mixture was brought to 500 µl using RNase-free water and stored at −20° C. until use.

Quantitative Real-Time PCR

For the quantitative analysis of mRNA expression, the ABI Prism 7500 Sequence Detection System (Applied Biosystems) was used and the data were analyzed using the Applied Biosystems 7500 System SDS Software Version 1.2.2. Primers and probes for the genes of interest and for G3PDH were designed using the Primer Express software (version 2.0; Applied Biosystems). Each primer was designed to produce an approximately 70-150 bp amplicon. Primer and probe sequences that can be utilized in the 6 gene predictor model and the 10 gene predictor model are listed in Table 4. Table 4 lists the forward and reverse primer for each gene as well as the fluorescent probe sequence that was dual labeled. Table 4 also provides the GenBank Accession No. corresponding to each gene and the location of the primer and probe sequences within the full-length nucleotide sequences provided under the GenBank Accession Nos. Table 4 also provides the InCytePD clone number for each gene (if available), a Unigene identification number for each gene (if available), the chromosomal location for each gene, and additional information about the primers and probes. The primer and probe sequences set forth in Table 4 are examples of the primers and probes that can be utilized to amplify and detect DET1-DET11. These examples should not be limiting, as one of skill in the art would know that other primer sequences for DET1-DET11, including primers comprising the sequences set forth in Table 4 and fragments thereof, can be utilized to amplify DET1-DET11. Similarly, other probes which specifically detect DET1-DET11 can be utilized, such as probes that comprise the probe sequences set forth in Table 4 and fragments thereof.

Primers and probes were synthesized by Sigma (sequences shown in Table 4; Sigma, The Woodlands, Tex.). Probes were labeled at the 5′ end with the reporter dye FAM (emission wavelength, 518 nm) and at the 3′ end with the quencher dye TAMRA (emission wavelength, 582 nm). Standards were created for the six genes using gel-extracted PCR products (Qiagen, Valencia, Calif.). The G3PDH standard was created using a plasmid construct containing the relevant G3PDH sequence (kind gift of Dr. Tetsuya Moriuchi, Osaka University (12)). For PCR, 12.5 µl TaqMan Universal PCR Master Mix, 0.5 µl per well each of 0.5 µl forward and reverse primers, and 0.5 µl per well of 10 µl dual labelled fluorescent probe were combined and adjusted to a total volume of 20 µl with RNase-free water. Finally, 5 µl cDNA per well was added to a total reaction volume of 25 µl. The PCR reaction was performed for 40 cycles of a two-step program: denaturation at 95° C. for 15 seconds, annealing and extension at 60° C. for 1 minute. The fluorescence was read at the completion of the 60° C. step. For each experiment, a no-template reaction was included as a negative control. Each cDNA sample was tested in triplicate, and the mean values were calculated. Triplicate values varied by no more than 10% from the mean. We used the standard curve absolute quantification technique to quantify copy number. A standard curve was generated using a ten-fold dilution series of four different known concentrations of the standards. The number of PCR cycles required for the threshold detection of the fluorescence signal (cycle threshold or Ct) was determined for each sample. Ct values of the standard samples were determined and plotted against the log amount of standard. Ct values of the unknown samples were then compared with the standard curve to determine the amount of target in the unknown sample. Standard curves from each experiment were compared to ensure accurate, precise and reproducible results. Each plate contained duplicate copies of serial dilutions of known standards and G3PDH, triplicate copies of cDNA from each sample and normal thyroid cDNA for amplification of G3PDH and the gene of interest.
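The standard curve quantification described above amounts to a straight-line fit of Ct against the log amount of standard, followed by reading unknowns off that line. A minimal sketch follows; the copy numbers and Ct values are hypothetical, and the function names are illustrative rather than part of the SDS software.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the amount of target."""
    return 10 ** ((ct - intercept) / slope)

def within_tolerance(replicates, tolerance=0.10):
    """True if every replicate lies within the given fraction of the mean."""
    mean = sum(replicates) / len(replicates)
    return all(abs(r - mean) <= tolerance * mean for r in replicates)

# Hypothetical ten-fold dilution series of four standards (illustrative only).
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.0, 26.68, 23.36, 20.04]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical triplicate Ct values for one unknown cDNA sample.
triplicate_ct = [24.9, 25.0, 25.16]
mean_ct = sum(triplicate_ct) / len(triplicate_ct)
estimated_copies = copies_from_ct(mean_ct, slope, intercept)
```

The slope is negative because each ten-fold increase in starting template lowers the cycle at which fluorescence crosses the threshold.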

Statistical Analysis

Data from 41 of the thyroid tumors were used to build a benign (FA, n=15; HN, n=10) versus malignant (PTC, n=9; FVPTC, n=7) expression ratio-based model, capable of predicting the diagnosis (benign versus malignant) of each sample. Ten additional samples were provided as blinded specimens, processed as described above and used as a validation set to test the model. These ten samples were not previously used in any other analysis. Expression values of all six genes in all samples and normal thyroid were standardized to the expression of G3PDH, a common housekeeping gene chosen to serve as a reference control. The ratio of the expression values for each gene in each sample was then compared to the ratio in normal thyroid, and converted to log2 to generate a gene expression ratio value for all 41 samples. A file containing the gene expression ratio values from all 51 samples (41 known, 10 unknown) was imported into a statistical analysis software package (Partek, Inc., St. Charles, Mo.).
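One reading of the standardization described above is sketched below: the gene's copy number is divided by the G3PDH copy number in the same specimen, the tumor ratio is divided by the normal thyroid ratio, and the result is converted to log2. This interpretation and the copy numbers are assumptions made for illustration.

```python
import math

def log2_expression_ratio(gene_sample, g3pdh_sample, gene_normal, g3pdh_normal):
    """log2 of (gene/G3PDH in sample) divided by (gene/G3PDH in normal thyroid)."""
    sample_ratio = gene_sample / g3pdh_sample
    normal_ratio = gene_normal / g3pdh_normal
    return math.log2(sample_ratio / normal_ratio)

# Hypothetical copy numbers from the standard curve (illustrative only).
value = log2_expression_ratio(
    gene_sample=500.0, g3pdh_sample=10000.0,    # tumor specimen
    gene_normal=2000.0, g3pdh_normal=10000.0,   # normal thyroid reference
)
```

On the log2 scale a value of 0 means no change relative to normal thyroid, negative values indicate lower expression in the sample, and positive values indicate higher expression.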

As a first step, the data from the 41 samples were subjected to principal component analysis (PCA) to provide a three-dimensional visualization of the data. All six genes were used to build a diagnosis-predictor model called a class prediction rule. This resulting rule was applied to predict the class of the ten samples in the validation set. The same analysis was then performed on a second set of data from 47 of the thyroid tumors to build a benign (FA, n=15; HN, n=11) versus malignant (PTC, n=9; FVPTC, n=12) expression ratio-based model. Ten additional unstudied samples were provided as blinded specimens for this second training set.

Principal Component Analysis (PCA) of the 41 samples using the gene expression values for all six genes showed a clear organization of the samples based on diagnosis. PCA was then conducted on all of the 41 samples with the 10 unknown samples. This combination of genes constituted a first predictor model and the validation set of 10 additional thyroid samples was used to confirm the accuracy of the model. The pathological diagnosis for each sample was kept blinded until after the analysis was completed. When the blind was broken, it was found that 8 of the 10 unknown samples were diagnosed by this model in concordance with the pathological diagnosis determined by standard pathologic criteria. One sample that was originally diagnosed as a benign follicular adenoma by standard histological criteria was diagnosed as malignant by the six gene predictor model set forth herein; one sample that was originally diagnosed as a papillary thyroid carcinoma by standard histological criteria was diagnosed as benign by the six gene predictor model set forth herein.

Further to the analysis above, the G3PDH standard was redesigned and processing of all tissue for total RNA extraction was standardized. Following these two modifications, Principal Component Analysis (PCA) was performed on the second training set of 47 samples and on ten new unknown samples using the gene expression values for all six genes. Again, PCA demonstrated a clear organization of the samples based on diagnosis. The pathological diagnosis for these ten new unknowns was also kept blinded until after the analysis. When the blind was broken, it was found that 9 of the samples were diagnosed in concordance with the pathological diagnosis by the six gene predictor model set forth herein. One sample that was diagnosed as a benign hyperplastic nodule by standard histological criteria was diagnosed as malignant by our model.

The results of the Taqman assays correlated with the microarray data. As shown in FIG. 5, the Taqman data utilizing the 6 gene model (DET1, DET2, DET3, DET4, DET5, DET6) demonstrate the ability to classify a thyroid sample as benign or malignant. Similar to results obtained via microarray, C21orf4, Hs.145049, KIT and LSM7 were upregulated in benign samples as compared to malignant samples. In other words, the expression of C21orf4, Hs.145049, KIT and LSM7 decreases during malignancy. Hs.296031 and SYNGR2 were upregulated in malignant samples as compared to benign samples. In other words, expression of Hs.296031 and SYNGR2 increases during malignancy. The same analysis was performed with the 10 gene model utilizing the primers and probes set forth in Table 4 for DET1, DET2, DET3, DET4, DET6, DET7, DET8, DET9, DET10 and DET11. As shown in FIG. 7, similar to results obtained via microarray, C21orf4, Hs.145049 (Hs.24183), KIT, FAM13A1, C11orf8, KIAA1128, IMPACT and CDH1 were upregulated in benign samples as compared to malignant samples. In other words, the expression of C21orf4, Hs.145049, KIT, FAM13A1, C11orf8, KIAA1128, IMPACT and CDH1 decreases during malignancy. Hs.296031 and SYNGR2 were upregulated in malignant samples as compared to benign samples. In other words, expression of Hs.296031 and SYNGR2 increases during malignancy. Therefore, it is clear that this pattern of differences between malignant and benign samples can be utilized to classify thyroid lesions utilizing the 6 gene model and the 10 gene model. In addition to classification, the Real Time PCR Taqman assay can also be used for staging thyroid cancer and in identifying agents that treat thyroid tumors.

Analysis of the 6 gene expression and the 10 gene expression profiles revealed that the benign lesions could be distinguished from the malignant lesions, and that this profile could be used to diagnose unknown samples against the current “gold standard” of pathologic criteria with a high degree of accuracy. Of the six genes in the six gene model, downregulation of KIT was seen in both benign and malignant thyroid tissue when compared to normal control. The magnitude of this downregulation was much greater in malignant thyroid tissue. KIT is a well-known proto-oncogene.

As to the other five genes in the six gene model, for three of these no functional studies are yet available. Of the remaining two genes, SYNGR2 has been characterized as an integral vesicle membrane protein. LSM7 likewise has been described in the family of Sm-like proteins, possibly involved in pre-mRNA splicing. The interaction of LSM7 with the TACC1 complex may participate in breast cancer oncogenesis. However, the role of LSM7 in thyroid oncogenesis has not yet been explored.

The six gene model determined the accurate diagnosis of 17 out of 20 unknown samples tested. Accuracy was based on a comparison to the “gold standard” pathologic diagnosis as determined by clinical pathologists. Therefore, this strategy demonstrates the power of genomic analysis as a technique for studying the underlying pathways responsible for the pathophysiology of neuroendocrine tumors. Further evaluation and linkage of clinical data to molecular profiling allows for a better understanding of tumor pathogenesis, or even normal thyroid function and development. In addition, the use of qRT-PCR can lead to incorporation of this model and/or the 10 gene model into preoperative decision making for patients with thyroid nodules.

The present invention is a clear example of how gene-expression profiling can provide highly useful diagnostic information. It is likely that gene expression profiling will be used in the future for clinical decision-making. For this purpose adequate reporting of DNA-microarray data to clinicians will be necessary. Gene-expression profiles may be more reproducible and clinically applicable than well-established but highly subjective techniques, such as histopathology. The small number of genes for which RNA expression levels are diagnostically and prognostically relevant could lead to a robust, affordable, commercially available testing system. To this end, the present invention provides a useful method for classifying thyroid nodules as benign or malignant and therefore helps facilitate appropriate, and eliminate unnecessary, operations in patients with suspicious thyroid tumors.

Throughout this application, various publications are referenced. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the scope or spirit of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

TABLE 1. Two-tailed Anova analysis with Bonferroni correction resulted in 47 genes significantly different (p < 0.05) between the malignant and the benign group.

Gene        Bonferroni p-value  Mean (benign)  S.D. +/−  Mean (malignant)  S.D. +/−
C21orf4     <0.0001             1.54           0.36      0.92              0.36
KIT         <0.0001             1.20           0.66      0.38              0.32
FLJ20477    <0.0001             1.16           0.28      0.76              0.22
MGC4276     0.0001              1.02           0.37      0.54              0.22
KIAA0062    0.001               1.03           0.51      0.46              0.25
CDH1        0.001               1.51           0.46      0.87              0.45
LSM7        0.001               1.28           0.53      0.69              0.27
ACYP1       <0.01               2.11           0.91      1.09              0.51
SYNGR2      <0.01               0.75           0.41      1.67              1.05
XPA         <0.01               2.29           0.84      1.31              0.58
AD-017      <0.01               1.57           0.63      0.84              0.44
DP1         <0.01               1.59           0.69      0.84              0.39
IDI1        <0.01               1.37           0.61      0.74              0.29
RODH        <0.01               1.36           0.93      0.45              0.36
ID4         <0.01               1.10           0.56      0.48              0.37
Hs.24183    <0.01               2.05           0.70      1.30              0.42
HTCD37      <0.01               1.22           0.37      0.78              0.30
DUSP5       <0.01               0.97           0.60      3.93              3.15
Hs.87327    <0.01               1.54           0.53      1.01              0.26
CRNKL1      0.01                1.33           0.49      0.79              0.34
LOC54499    0.01                1.33           0.50      0.83              0.26
RAP140      0.01                1.60           0.58      1.00              0.35
MAPK4       0.01                0.66           0.38      0.30              0.16
Hs.296031   0.01                1.13           0.63      2.28              1.12
ATP6V1D     0.01                1.71           0.75      0.94              0.46
TXNL        0.01                1.19           0.66      0.57              0.28
FAM13A1     0.02                1.35           0.60      0.71              0.43
GUK1        0.02                0.87           0.43      1.56              0.66
Hs.383203   0.02                1.55           0.57      0.91              0.45
C11orf8     0.02                0.81           0.43      0.36              0.30
DENR        0.02                1.54           0.42      1.02              0.42
PRDX1       0.02                1.36           0.40      0.84              0.44
FLJ20534    0.02                1.94           0.92      1.08              0.40
DI02        0.02                1.95           1.37      0.70              0.52
C21orf51    0.02                1.01           0.40      0.63              0.22
KIAA1128    0.03                1.76           0.87      0.90              0.52
IMPACT      0.03                1.32           0.48      0.86              0.27
KIAA0089    0.03                1.43           0.63      0.76              0.49
HSD1784     0.03                1.45           0.57      0.88              0.36
MAP4K5      0.04                1.59           0.61      0.97              0.44
ELF3        0.04                0.82           0.24      1.45              0.72
ALDH7A1     0.04                1.61           0.52      0.96              0.58
BET1        0.04                1.38           0.55      0.82              0.39
GTF2H2      0.04                1.80           0.54      1.23              0.44
DC6         0.04                1.19           0.34      0.81              0.29
CDH1        0.04                1.31           0.49      0.82              0.34

The genes are listed from the most to the least significant. In bold are all the genes that combined together created the best predictor model.

TABLE 2. Results of the cross validation analysis using the “leave-one-out” method (see materials and methods).

Class       # per Class  # Correct  # Error  % Correct  % Error
Benign      31           27         4        87.1       12.9
Malignant   32           28         4        87.5       12.5
Total       63           55         8        87.3       12.7
Normalized                                   87.3       12.7

The predictor model was able to correctly predict 87% of the diagnoses. The outcome is called a confusion matrix.
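The percentages in Table 2 follow directly from the counts. As a short arithmetic check, using only the counts given in the table (the variable and function names are illustrative):

```python
# Counts taken from Table 2: (samples per class, correctly predicted).
counts = {"Benign": (31, 27), "Malignant": (32, 28)}

def percent_correct(n_total, n_correct):
    """Percentage of correctly predicted samples, rounded to one decimal."""
    return round(100.0 * n_correct / n_total, 1)

per_class = {name: percent_correct(n, c) for name, (n, c) in counts.items()}

total_samples = sum(n for n, _ in counts.values())
total_correct = sum(c for _, c in counts.values())
overall = percent_correct(total_samples, total_correct)

# "Normalized" accuracy: the unweighted mean of the per-class accuracies.
normalized = round(sum(100.0 * c / n for n, c in counts.values()) / len(counts), 1)
```

The overall and normalized figures coincide here because the two classes are nearly the same size.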

TABLE 3. DIAGNOSIS PREDICTOR MODEL (31 benign tumors, 32 malignant tumors). In this table the two predictor models of 10 and 6 genes are shown with their gene expression values, the predicted diagnosis, the percentage probability of the diagnosis being correct and the pathologic diagnosis.

10 gene diagnosis predictor model (cross validation of 83%)

C11orf8  C21orf4  CDH1  FAM13A1  Hs.24183  Hs.296031  IMPACT  KIAA1128  KIT   SYNGR2  % benign prob.  % malignant prob.  Predicted Diagnosis  Pathologic Diagnosis
0.4561   1.35     1.53  0.76     1.81      1.55       1.02    1.21      2.03  1.12    0.99            0.01               benign               FA
0.4988   0.82     0.83  0.45     1.67      1.74       0.93    1.27      0.27  0.54    0.02            0.98               malignant            FVPTC
1.311    0.78     2.13  1.13     1.39      0.65       1.36    1.19      1.70  1.04    0.91            0.09               benign               HN
0.5143   1.05     0.62  0.85     0.95      1.56       1.16    0.86      0.80  0.78    0.43            0.57               malignant            PTC
0.3786   2.07     0.64  1.44     1.84      1.51       0.48    1.14      1.32  2.65    0.94            0.06               benign               FA
0.7376   1.81     0.85  1.85     1.34      0.55       0.91    1.56      1.83  2.70    1.00            0.00               benign               FA
0.1206   0.57     0.50  0.55     0.86      1.94       0.61    0.99      0.25  4.88    0.00            1.00               malignant            PTC
0.026    1.27     0.46  0.59     1.22      1.19       0.91    0.56      0.11  4.69    0.00            1.00               malignant            PTC
0.1097   0.70     2.17  1.01     1.24      0.82       0.96    0.93      1.59  3.69    0.05            0.95               malignant            HN
1.0368   1.37     1.24  1.50     1.23      1.74       0.94    1.82      2.82  1.38    1.00            0.00               benign               HN

6 gene diagnosis predictor model (cross validation of 87%)

C21orf4  Hs.24183  Hs.296031  KIT   LSM7  SYNGR2  % benign prob.  % malignant prob.  Predicted Diagnosis  Pathologic Diagnosis
1.3518   1.81      1.55       2.03  2.40  1.12    1.00            0.00               benign               FA
0.819    1.67      1.74       0.27  0.56  0.54    0.15            0.85               malignant            FVPTC
0.7822   1.39      0.65       1.70  1.33  1.04    0.94            0.06               benign               HN
1.0457   0.95      1.56       0.80  0.85  0.79    0.33            0.67               malignant            PTC
2.0723   1.84      1.51       1.32  1.05  2.65    0.06            0.14               benign               FA
1.8053   1.34      0.65       1.83  1.47  2.70    0.96            0.04               benign               FA
0.5555   0.85      1.94       0.25  0.66  4.88    0.00            1.00               malignant            PTC
1.2698   1.22      1.19       0.11  0.43  4.69    0.00            1.00               malignant            PTC
0.698    1.24      0.82       1.69  1.60  3.69    0.10            0.90               malignant            HN
1.3677   1.23      1.74       2.92  1.04  1.38    0.99            0.01               benign               HN

FA = follicular adenoma, HN = hyperplastic nodules, FVPTC = follicular variant papillary thyroid carcinoma and PTC = papillary thyroid carcinoma. The square indicates the unknown sample for which there was discordance between the predicted and the pathologic diagnosis. The percentage diagnosis probability for both 6 and 10 gene combinations strongly suggested that this was a malignant sample. The sample was re-reviewed by the pathologist and the pathologic diagnosis was in fact changed to a neoplasm with uncertain malignant potential.

TABLE 4. This table shows the primer and probe sequences that can be utilized in the 6 gene predictor model and the 10 gene predictor model.

Table 4 Thyroid Primer/Probes

Oligo Name          SEQ ID NO.     Length  Sequence (5′-3′)                                Tm   Residues   InCyte PD Clone
Hs.24183-Forward    SEQ ID NO. 1   22      ggcgactggcaaaagag                                    2438-2457  2123020
Hs.24183-Reverse    SEQ ID NO. 2   26                                                           2530-2506  2123020
Hs.24183-Probe      SEQ ID NO. 3   23      ( )TggCCTgTCACTCCCATgATgC(Tamra)                     2462-2484  2123020
globulin-forward    SEQ ID NO. 4   18      aagggctcgcatgcaag                               59   2036-2053
globulin-reverse    SEQ ID NO. 5   25      cacagtagcactcg                                  60   2157-2133
globulin-probe      SEQ ID NO. 6   33      ( )TTTgTCCCTgCTTgTACTAgTgAgg(Tamra)             69   2088-2120
c21orf4-forward     SEQ ID NO. 7   22      gctatcctcttacctcccgt                                 2822-2643  1710736
c21orf4-reverse     SEQ ID NO. 8   25      gga                                                  2743-2712  1710736
c21orf4-Probe       SEQ ID NO. 9   28      ( )CTgcgACACAgATgTATCCTCCACTCC(Tamra)                2652-2679  1710736
fam13a1-forward     SEQ ID NO. 10  22                                                           2931-2952  1458358
fam13a1-reverse     SEQ ID NO. 11  25      gca                                                  3058-3034  1458388
fam13a1-Probe       SEQ ID NO. 12  23      ( )TgTTTgTggAATCCATgAAggTTATggC(Tamra)               2992-3014  1458388
c11orf8-forward     SEQ ID NO. 13  16      ccggcccagc                                           849-864    4117578
c11orf8-reverse     SEQ ID NO. 14  21      gtg                                                  916-896    4117578
c11orf8-Probe       SEQ ID NO. 15  29      ( )TgTTTggTggAATCCATgAAggTTATggC(Tamra)              866-894    4117578
kiaa1128-forward    SEQ ID NO. 16  20      gagagcg                                              5980-5999  1428225
kiaa1128-reverse    SEQ ID NO. 17  23                                                           6083-8041  1428225
kiaa1128-probe      SEQ ID NO. 18  33      ( )TCACTTCCAAATgTTCCTgTAgCATAAATggTg(Tamra)          6004-6036  1428225
Hs.296031-forward   SEQ ID NO. 19  24      gcaaaggag                                            4271-4294  29557644
Hs.296031-reverse   SEQ ID NO. 20  20      atgacggcatg                                          4353-4334  29557644
Hs.296031-probe     SEQ ID NO. 21  29      ( )TTggTCCCCTCAgTTCTATgCTgTTgTgT(Tamra)              4301-4329  29557644
-forward            SEQ ID NO. 22  26      gcactgc                                              2704-2129  2358031/1572225
-reverse            SEQ ID NO. 23  28                                                           2643-2816  2358031/1572225
-probe              SEQ ID NO. 24  36      ( )ATTgTTCAgCTAATTgAgAAgCAgATTTCAgAgAgC(Tamra)       2779-2814  2358031/1572225
impact-forward      SEQ ID NO. 25  26      gaagaa                                               809-864    983008
impact-reverse      SEQ ID NO. 26  25      atgc                                                 943-918    983008
impact-probe        SEQ ID NO. 27  29      ( )CTggTATggAgggATTCTgCTAggACCAg(Tamra)              837-865    983008
cdh1-forward        SEQ ID NO. 28  21      gagtg                                                2499-2519  1911913/2060560
cdh1-reverse        SEQ ID NO. 29  21      cagccgccag                                           2579-2559  1911913/2060560
cdh1-probe          SEQ ID NO. 30  27      ( )CCTgCCAATCCCgATgAAATTggAAAT(Tamra)                2525-2551  1911913/2060560
syngr2-forward      SEQ ID NO. 31  18      gctgg                                                1255-1273
syngr2-reverse      SEQ ID NO. 32  19      ccct                                                 1374-1356
syngr2-probe        SEQ ID NO. 33  24      ( )aagggcttgcctgaca (Tamra)                          1303-1328
lsm7-forward        SEQ ID NO. 34  21      gacg                                                 72-82
lsm7-reverse        SEQ ID NO. 35  20      agg                                                  148-127
lsm7-probe          SEQ ID NO. 36  22      ( )aggcccg (Tamra)                                   95-117
G3PDH-Forward       SEQ ID NO. 37  22      TCACCAGGGCTGCTTTTAACTC                               128-149
G3PDH-Reverse       SEQ ID NO. 38  25      GGAATCATATTGGAACATGTAAACCA                           228-203
G3PDH-probe         SEQ ID NO. 39  27      FAM-TTGCCATCAATGACCCCTTCATTGACC-TAMRA                167-193

normal thyroid Coated Lot63100784 paded 65 autopsy sample patients

Table 4 Thyroid Primer/Probes

Oligo Name | Unigene | GenBank/RefSeq (CM Paper) | GenBank/RefSeq (TaqMan Primer/Probe) | Chromosome | Details
Hs.24183-Forward | Hs.24183 | KP060255 | ALB32414.1 | 21 | used later part of sequence
Hs.24183-Reverse | Hs.24183 | KP060255 | ALB32414.1 | |
Hs.24183-Probe | Hs.24183 | KP060255 | ALB32414.1 | |
[illegible]globulin-forward | | NM_033235 | NM_003235 | | used with Exon9
[illegible]globulin-reverse | | NM_033235 | NM_003235 | |
[illegible]globulin-probe | | NM_033235 | NM_003235 | |
c21orf4-forward | (Hs.284142-rel)Hs.433668 | AP001717 | NM_006134.4 | 21q22.11 | spans Exon7-8
c21orf4-reverse | (Hs.284142-rel)Hs.433668 | AP001717 | NM_006134.4 | |
c21orf4-Probe | (Hs.284142-rel)Hs.433668 | AP001717 | NM_006134.4 | |
fam13a1-forward | (Hs.177644-removed)Hs.442818 | (NM0148883)fromA8020721 | (NM014883)fromAB020721 | 4q22.1 | used later part of seg-exon19
fam13a1-reverse | (Hs.177644-removed)Hs.442818 | (NM0148883)fromA8020721 | (NM014883)fromAB020721 | |
fam13a1-Probe | (Hs.177644-removed)Hs.442818 | (NM0148883)fromA8020721 | (NM014883)fromAB020721 | |
c11orf8-forward | (Hs.45638-rel)Hs.432000 | NM001584 | NM001584 | 11p13 | spans Exon5-8
c11orf8-reverse | (Hs.45638-rel)Hs.432000 | NM001584 | NM001584 | |
c11orf8-Probe | (Hs.45638-rel)Hs.432000 | NM001584 | NM001584 | |
kiaa1128-forward | Hs.81897 | AB032914.1-this is actually AB032954.1 | AB032954.1 | 10q23.2 | used later part of sequence
kiaa1128-reverse | Hs.81897 | AB032914.1-this is actually AB032954.1 | AB032954.1 | |
kiaa1128-probe | Hs.81897 | AB032914.1-this is actually AB032954.1 | AB032954.1 | |
Hs.298031-forward | Hs.296031 | BC38512.1 | BC38512.1 | X | used later part of sequence
Hs.298031-reverse | Hs.296031 | BC38512.1 | BC38512.1 | |
Hs.298031-probe | Hs.296031 | BC38512.1 | BC38512.1 | |
[illegible]-forward | Hs.81665 | X05182.1 | X05182.1 | 4q11-q12 | spans Exon 19-20
[illegible]-reverse | Hs.81665 | X05182.1 | X05182.1 | |
[illegible]-probe | Hs.81665 | X05182.1 | X05182.1 | |
impact-forward | Hs.284245 | NM018439 | NM018439 | 18q11.2-q12.1 | spans Exon 10-11
impact-reverse | Hs.284245 | NM018439 | NM018439 | |
impact-probe | Hs.284245 | NM018439 | NM018439 | |
cdh1-forward | HS 194857 | NM004350 | NM004350 | 16q22.1 | spans Exon 15-18
cdh1-reverse | HS 194857 | NM004350 | NM004350 | |
cdh1-probe | HS 194857 | NM004350 | NM004350 | |
syngr2-forward | (Hs.5097-rel) Hs.433753 | NM004710.2 | NM004710.2 | 17q25.3 | used later sequence
syngr2-reverse | (Hs.5097-rel) Hs.433753 | NM004710.2 | NM004710.2 | |
syngr2-probe | (Hs.5097-rel) Hs.433753 | NM004710.2 | NM004710.2 | |
lsm7-forward | (Hs.70830-rel)Hs.512610 | NM0151991.1 | NM0151991.1 | 19p13.3 | used later sequence
lsm7-reverse | (Hs.70830-rel)Hs.512610 | NM0151991.1 | NM0151991.1 | |
lsm7-probe | (Hs.70830-rel)Hs.512610 | NM0151991.1 | NM0151991.1 | |
G3PDH-Forward | | NM_002048 | | | from Takahashi paper
G3PDH-Reverse | | NM_002048 | | |
G3PDH-probe | | NM_002048 | | |
normal thyroid 650-424-8222 sample

indicates data missing or illegible when filed
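One simple sanity check on a primer/probe table of this kind is to compare each oligo's sequence length with the length implied by its residue coordinates. The Python sketch below does this for the three fully legible G3PDH entries of Table 4. It assumes the residue coordinates are inclusive and that a descending range denotes a reverse-strand oligo; the helper name `span_length` is hypothetical.

```python
def span_length(residues):
    """Length implied by an inclusive 'start-end' residue range.

    Descending ranges (e.g. '228-203') are treated as reverse-strand
    oligos, so the absolute difference is used.
    """
    start, end = (int(part) for part in residues.split("-"))
    return abs(end - start) + 1


# (sequence without dye labels, residue range), copied from Table 4.
oligos = {
    "G3PDH-Forward": ("TCACCAGGGCTGCTTTTAACTC", "128-149"),
    "G3PDH-Reverse": ("GGAATCATATTGGAACATGTAAACCA", "228-203"),
    "G3PDH-probe": ("TTGCCATCAATGACCCCTTCATTGACC", "167-193"),
}

# For each oligo: (number of bases in the sequence, length implied by residues).
checks = {
    name: (len(sequence), span_length(residues))
    for name, (sequence, residues) in oligos.items()
}

for name, (seq_len, res_len) in checks.items():
    status = "consistent" if seq_len == res_len else "mismatch"
    print(f"{name}: sequence={seq_len} residues={res_len} ({status})")
```

The same check could be applied to the other rows once their sequences are available in full.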

1. A method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of differentially expressed thyroid (DET) gene C21orf4, Hs.145049, Hs.296031, KIT, LSM7, and SYNGR2 in a test cell population obtained from the thyroid tumor in the subject, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET gene C21orf4, Hs.145049, Hs.296031, KIT, LSM7, and SYNGR2; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET gene C21orf4, Hs.145049, Hs.296031, KIT, LSM7, and SYNGR2, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.
2. The method of claim 1, wherein a difference in the expression of the nucleic acid(s) in the test cell population as compared to the reference cell population indicates that the test cell population has a different stage than the cells from the reference cell population.
3. The method of claim 1, wherein a similar expression pattern of the nucleic acid(s) in the test cell population as compared to the reference cell population indicates that the test cell population has the same thyroid tumor stage as the cells from the reference cell population.
4. The method of claim 1, wherein the reference cell population is a plurality of cells or a database.
5. The method of claim 1, wherein the subject is a human.
6. The method of claim 1, wherein the tumor or thyroid lesion is selected from the group consisting of: papillary thyroid carcinoma, follicular variant of papillary thyroid carcinoma, follicular carcinoma, Hurthle cell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroid lymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma.
7. The method of claim 1, wherein expression of the nucleic acid(s) is measured by microarray.
8. The method of claim 1, wherein expression of the nucleic acid(s) is measured by probing the nucleic acid(s).
9. The method of claim 1, wherein expression of the nucleic acid(s) is measured by amplifying the nucleic acid(s).
10. The method of claim 1, wherein the expression of the nucleic acid(s) is measured by amplifying the nucleic acid(s) and detecting the amplified nucleic acid with a fluorescent probe.
11. The method of claim 10, wherein C21orf4 nucleic acid is amplified utilizing forward primer GCAATCCTCTTACCTCCGCTTT (SEQ ID NO: 7) and reverse primer GGAATCGGAGACAGAAGAGAGCTT (SEQ ID NO: 8) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence CTGGGACCACAGATGTATCCTCCACTCC (SEQ ID NO: 9) linked to a fluorescent label.
12. The method of claim 10, wherein Hs.145049 nucleic acid is amplified utilizing forward primer GGCTGACTGGCAAAAAGTCTTG (SEQ ID NO: 1) and reverse primer TTGGTTCCCTTAAGTTCTCAGAGTTT (SEQ ID NO: 2) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TGGCCCTGTCACTCCCATGATGC (SEQ ID NO: 3) linked to a fluorescent label.
13. The method of claim 10, wherein Hs.296031 nucleic acid is amplified utilizing forward primer TGCCAAGGAGCTTTGTTTATAGAA (SEQ ID NO: 19) and reverse primer ATGACGGCATGTACCAACCA (SEQ ID NO: 20) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TTGGTCCCCTCAGTTCTATGCTGTTGTGT (SEQ ID NO: 21) linked to a fluorescent label.
14. The method of claim 10, wherein KIT nucleic acid is amplified utilizing forward primer GCACCTGCTGAAATGTATGACATAAT (SEQ ID NO: 22) and reverse primer TTTGCTAAGTTGGAGTAAATATGATTGG (SEQ ID NO: 23) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence ATTGTTCAGCTAATTGAGAAGCAGATTTCAGAGAGC (SEQ ID NO: 24) linked to a fluorescent label.
15. The method of claim 10, wherein LSM7 nucleic acid is amplified utilizing forward primer GACGATCCGGGTAAAGTTCCA (SEQ ID NO: 34) and reverse primer AGGTTGAGGAGTGGGTCGAA (SEQ ID NO: 35) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence AGGCCGCGAAGCCAGTGGAATC (SEQ ID NO: 36) linked to a fluorescent label.
16. The method of claim 10, wherein SYNGR2 nucleic acid is amplified utilizing forward primer GCTGGTGCTCATGGCACTT (SEQ ID NO: 31) and reverse primer CCCTCCCCAGGCTTCCTAA (SEQ ID NO: 32) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence AAGGGCTTTGCCTGACAACACCCA (SEQ ID NO: 33) linked to a fluorescent label.
17. The method of claim 1, wherein expression of the nucleic acid(s) is measured by detecting the protein expression product of the nucleic acid(s).
18. A method of identifying the stage of a thyroid tumor in a subject comprising: a) measuring the expression of one or more nucleic acid sequences selected from the group consisting of differentially expressed thyroid (DET) gene C21orf4, Hs.145049, Hs.296031, KIT, SYNGR2, C11orf8, CDH1, FAM13A1, IMPACT, and KIAA1128 in a test cell population obtained from the thyroid tumor in the subject, wherein at least one cell in said test cell population is capable of expressing one or more nucleic acid sequences selected from the group consisting of DET gene C21orf4, Hs.145049, Hs.296031, KIT, SYNGR2, C11orf8, CDH1, FAM13A1, IMPACT, and KIAA1128; b) comparing the expression of the nucleic acid sequence(s) to the expression of the nucleic acid sequence(s) in a reference cell population comprising at least one cell for which a thyroid tumor stage is known; and c) identifying a difference, if present, in expression levels of one or more nucleic acid sequences selected from the group consisting of DET gene C21orf4, Hs.145049, Hs.296031, KIT, SYNGR2, C11orf8, CDH1, FAM13A1, IMPACT, and KIAA1128, in the test cell population and reference cell population, thereby identifying the stage of the thyroid tumor in the subject.
19. The method of claim 18, wherein a difference in the expression of the nucleic acid(s) in the test cell population as compared to the reference cell population indicates that the test cell population has a different stage than the cells from the reference cell population.
20. The method of claim 18, wherein a similar expression pattern of the nucleic acid(s) in the test cell population as compared to the reference cell population indicates that the test cell population has the same thyroid carcinoma stage as the cells from the reference cell population.
21. The method of claim 18, wherein the reference cell population is a plurality of cells or a database.
22. The method of claim 18, wherein the subject is a human.
23. The method of claim 18, wherein the thyroid tumor is selected from the group consisting of: papillary thyroid carcinoma, follicular variant of papillary thyroid carcinoma, follicular carcinoma, Hurthle cell tumor, anaplastic thyroid cancer, medullary thyroid cancer, thyroid lymphoma, poorly differentiated thyroid cancer and thyroid angiosarcoma.
24. The method of claim 18, wherein expression of the nucleic acid(s) is measured by microarray.
25. The method of claim 18, wherein expression of the nucleic acid(s) is measured by probing the nucleic acid(s).
26. The method of claim 18, wherein expression of the nucleic acid(s) is measured by amplifying the nucleic acid(s).
27. The method of claim 18, wherein the expression of the nucleic acid(s) is measured by amplifying the nucleic acid(s) and detecting the amplified nucleic acid with a fluorescent probe.
28. The method of claim 27, wherein C21orf4 nucleic acid is amplified utilizing forward primer GCAATCCTCTTACCTCCGCTTT (SEQ ID NO: 7) and reverse primer GGAATCGGAGACAGAAGAGAGCTT (SEQ ID NO: 8) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence CTGGGACCACAGATGTATCCTCCACTCC (SEQ ID NO: 9) linked to a fluorescent label.
29. The method of claim 27, wherein Hs.145049 nucleic acid is amplified utilizing forward primer GGCTGACTGGCAAAAAGTCTTG (SEQ ID NO: 1) and reverse primer TTGGTTCCCTTAAGTTCTCAGAGTTT (SEQ ID NO: 2) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TGGCCCTGTCACTCCCATGATGC (SEQ ID NO: 3) linked to a fluorescent label.
30. The method of claim 27, wherein Hs.296031 nucleic acid can be amplified utilizing forward primer TGCCAAGGAGCTTTGTTTATAGAA (SEQ ID NO: 19) and reverse primer ATGACGGCATGTACCAACCA (SEQ ID NO: 20) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TTGGTCCCCTCAGTTCTATGCTGTTGTGT (SEQ ID NO: 21) linked to a fluorescent label.
31. The method of claim 27, wherein KIT nucleic acid is amplified utilizing forward primer GCACCTGCTGAAATGTATGACATAAT (SEQ ID NO: 22) and reverse primer TTTGCTAAGTTGGAGTAAATATGATTGG (SEQ ID NO: 23) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence ATTGTTCAGCTAATTGAGAAGCAGATTTCAGAGAGC (SEQ ID NO: 24) linked to a fluorescent label.
32. The method of claim 27, wherein SYNGR2 nucleic acid is amplified utilizing forward primer GCTGGTGCTCATGGCACTT (SEQ ID NO: 31) and reverse primer CCCTCCCCAGGCTTCCTAA (SEQ ID NO: 32) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence AAGGGCTTTGCCTGACAACACCCA (SEQ ID NO: 33) linked to a fluorescent label.
33. The method of claim 27, wherein C11orf8 nucleic acid is amplified utilizing forward primer CCGGCCCAAGCTCCAT (SEQ ID NO: 13) and reverse primer TTGTGTAACCGTCGGTCATGA (SEQ ID NO: 14) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TGTTTGGTGGAATCCATGAAGGTTATGGC (SEQ ID NO: 15) linked to a fluorescent label.
34. The method of claim 27, wherein CDH1 nucleic acid is amplified utilizing forward primer TGAGTGTCCCCCGGTATCTTC (SEQ ID NO: 28) and reverse primer CAGCCGCTTTCAGATTTTCAT (SEQ ID NO: 29) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence CCTGCCAATCCCGATGAAATTGGAAAT (SEQ ID NO: 30) linked to a fluorescent label.
35. The method of claim 27, wherein IMPACT nucleic acid is amplified utilizing forward primer ATGGCAGTGCAGTCATCATCTT (SEQ ID NO: 10) and reverse primer GCATTCATACAGCTGCTTACCATCT (SEQ ID NO: 11) and the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TTTGGTCCCTGCCTAGGACCGGG (SEQ ID NO: 12) linked to a fluorescent label.
36. The method of claim 27, wherein FAM13A1 nucleic acid is amplified utilizing forward primer TGAAGAATGTCATGGTGGTAGTATCA (SEQ ID NO: 25) and reverse primer ATGACTCCTCAGGTGAATTTGTGTAG (SEQ ID NO: 26) and wherein the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence CTGGTATGGAGGGATTCTGCTAGGACCAG (SEQ ID NO: 27) linked to a fluorescent label.
37. The method of claim 27, wherein KIAA1128 nucleic acid is amplified utilizing forward primer GAGAGCGTGATCCCCCTACA (SEQ ID NO: 16) and reverse primer ACCAAGAGTGCACCTCAGTGTCT (SEQ ID NO: 17) and the amplified nucleic acid is detected with a probe comprising the nucleic acid sequence TCACTTCCAAATGTTCCTGTAGCATAAATGGTG (SEQ ID NO: 18) linked to a fluorescent label.
38. The method of claim 18, wherein expression of the nucleic acid(s) is measured by detecting the protein expression product of the nucleic acid(s).
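By way of illustration only, the comparing and identifying steps recited in the claims above could be sketched in Python as follows. The expression values, the two-fold cutoff, and the function name `differing_genes` are hypothetical assumptions chosen for the example; the claims do not specify a particular threshold or computation.

```python
import math


def differing_genes(test, reference, min_abs_log2=1.0):
    """Return genes whose expression differs between two cell populations.

    `test` and `reference` map gene names to relative expression levels.
    A gene is reported when the absolute log2 ratio (test / reference)
    reaches `min_abs_log2`; the default of 1.0 is an assumed two-fold cutoff.
    """
    differences = {}
    for gene, reference_level in reference.items():
        if gene not in test:
            continue  # gene not measured in the test population
        log2_ratio = math.log2(test[gene] / reference_level)
        if abs(log2_ratio) >= min_abs_log2:
            differences[gene] = log2_ratio
    return differences


# Hypothetical relative expression levels, for illustration only.
reference = {"KIT": 1.0, "LSM7": 1.0, "SYNGR2": 2.0}
test = {"KIT": 0.25, "LSM7": 1.0, "SYNGR2": 2.0}

diffs = differing_genes(test, reference)
print(diffs)
```

An empty result would correspond to a similar expression pattern in the test and reference populations, while a non-empty result lists the genes whose expression differs.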